Preface


This text has developed from courses that I have taught on analysis to university 
students of mathematics in their first semester. However, it has grown to be much 
more than a course for first year students. Although the 11 chapters (beginning at 
Chap. 2) include the material usually to be found in beginning courses of analysis, I 
have also had further objectives that are not usually communicated to beginning 
students. 

In my view mathematics underwent a transformation in the late nineteenth 
century and early twentieth, indeed a revolution, that could not have been foreseen. 
Previously there had not been general agreement about standards of proof except 
perhaps in classical Euclidean geometry. The logical basis of arguments used to 
prove results about calculus and infinite series, indeed about most mathematics 
since the Renaissance, was not understood, and there was some anxiety as to 
whether the edifice of mathematical knowledge might come crashing down. Some 
mathematicians issued warnings, but the speed and momentum of new discoveries 
was fortunately unstoppable, and the results seemed correct; they certainly passed 
all empirical tests of correctness. The idea that analysis was transformed, from a 
subject with shaky foundations, to become a flagship of correct mathematical 
argument, and that this change occurred over a rather short period, is something that 
I regard as important for understanding its nature. Suddenly there was general 
agreement over what constituted a correct proof, provided only that the details 
could be taken in by a human reader. Historically, analysis was a great success 
story. 

The objective implied by the last paragraph, to communicate to the reader the 
success of analysis in overcoming previously held doubts, is not attained by 
pedantic rigour or a painstaking level of formality. Nor does it call for any genuine 
attempt to recount the history of analysis or follow a historical development of the 
subject. Nor does it call for any novel approach to any topic. It does, though, colour 
the way the topics are presented. It calls for clarity and meaning in the proofs, 
honesty about what has been achieved and the ever present awareness that the 
reader is an intelligent adult who genuinely is trying to understand what analysis is 
about and why it is important. 



Set theory has played a basic role in the evolution of analysis. A text of this kind 
has to present a certain amount of set theory at the outset. But a proper axiomatic 
treatment of set theory would alienate many readers. The alternative to this has 
usually been the awfully named naive set theory, involving a careless approach to 
difficult ideas. But this too can alienate readers (though probably not the same 
group of readers). Some middle approach is needed that is honest about set theory 
but not pedantically detailed. The reader should be made aware that there is a need 
for clear principles for building sets, including many sets that mathematicians take 
for granted, even if these principles are not all explained in detail. How can
analysis, as it is usually understood, exist unless it is accepted that an infinite set exists?
This is a major stumbling block for those not trained in mathematics, and a “look, it’s
obvious” approach will not win any converts. And why should it? In an accurate
treatment an infinite set is introduced by a set-building axiom. When set theory is 
naive, non-mathematicians can appear foolish, and mathematicians can appear 
doctrinaire. 

Every text of this kind has its red lines, introduced by the hackneyed phrase 
‘beyond the scope of this book’. The word ‘fundamental’ of the title is supposed to
be taken seriously and construed as meaning a certain portion of analysis. What bits 
of analysis are fundamental? They must include the following items: an accurate 
description of the real numbers, limits, infinite series, continuity, derivatives, 
integrals and the elementary transcendental functions. These are standard contents 
of a first university course in analysis. Very broadly, the boundaries of fundamental 
analysis lie where the key to further progress requires certain far-reaching theories 
that are introduced to students after a first course, typically, complex analysis, 
metric spaces, multivariate calculus or the Lebesgue integral. 

In this text there is no discussion of countability versus uncountability for sets. 
There are no open or closed sets (apart from intervals), and therefore no topology or 
metrics; and certainly no Heine-Borel theorem, though we go dangerously close to
requiring it. This means that we stop short of a nice, necessary and sufficient
condition for integrability. The integral is Riemann-Darboux; though I freely
confess my view that the Lebesgue integral is the greatest advance in analysis of the
twentieth century. There is no general treatment of any class of differential
equations; though some very special and important equations appear at crucial places in
the narrative. There is no complex analysis (though there is a chapter introducing 
complex numbers) and almost no functions of several variables; certainly no final 
chapter, so beloved by authors of analysis texts, entitled ‘Extension to several 
variables’. Surely several variables deserve a book of their own. 

Missing is any construction of the real numbers or the complex numbers. My 
view is clear: neither construction is needed for analysis. They serve only two 
purposes: logically, to prove that the axioms of analysis are consistent; and
pedagogically, to answer students who obstinately want to know what the square root of
two and the square root of minus one are in reality, and who are not necessarily 
convinced by the answers. Moreover, giving prominence to constructions tends to 
suggest that there is only one right way to understand real numbers or complex 
numbers.

After reading two paragraphs devoted to what is not included, the reader may
well wonder what the author considers fundamental analysis to be. To see what is 
included the reader is referred to the rather thorough list of contents. 

Analysis is like the trunk of a great tree that gives rise to branches, some small, 
some large and some still growing. I have included a number of sections marked 
with the symbol (◊) and referred to as nuggets (as in nugget of wisdom, though I'm
actively searching for a different name). These take up fascinating topics that can be
explored using fundamental analysis, but can be omitted without losing the main
thread. They go in some cases far beyond what a beginning student would ordinarily
encounter. They serve to enrich the narrative and often point to a whole
subject area that springs out of the main trunk of the tree. They are not needed in the
main text of sections not so marked; however, they may be needed for some of the
exercises. Some push the boundaries of the main text and encroach on areas where
one really starts to need complex analysis or multivariate calculus to make significant
progress. Mostly they can be omitted in a first course of analysis. These
sections, and exercises elsewhere that may need material from them, are marked
with the nugget symbol (◊). Most conclude with a short subsection called ‘Pointers
to further study’ listing topics or whole subject areas that the reader can look up if 
they wish to pursue the topic of the nugget further. 


Advice for Instructors 


This text began life as lectures for a first course of analysis, which was taught a 
number of times to mathematics students in their first year at the University of 
Iceland, and consisted of 23 lectures of 80 minutes each. Material from all 12 chapters
was covered in the lectures, in the same order of presentation, omitting the content
of sections marked with the nugget symbol (◊). Some of the topics that ended up in
the nuggets were assigned to students as study projects on which they were required 
to give a presentation. 

Thus, in spite of a considerable expansion, and because the additional and more 
demanding material is clearly marked, the text can be used as the basis for a course. 
The instructor would only have to agree with the author on a number of key
pedagogical issues that can give rise to heated disputes and to which the answer is a
matter of personal preference: for example, whether or not to construct the real
numbers; whether to present sequences and series before functions of a real
variable; whether uniform convergence should be covered in a first course; and
how to define the circular functions. The first issue is discussed in the nugget
(Sect. 3.10) ‘Philosophical implications of decimals’ and elsewhere in this preface; 
the elementary transcendental functions are rigorously defined and studied at the 
earliest point in the text at which it is possible in a practical and meaningful manner. 

The text contains many exercises, mostly collected together into exercise sections.
However, some isolated exercises interrupt the text with the purpose of
inviting the reader to engage immediately and constructively with the material. It
will be noticed that many of the exercises are challenging and some of them present
results of independent interest. It is expected that the instructor can provide additional
routine exercises for the purpose of practising the basic rules.

Scattered throughout the text are some pictures. The philosophy behind them is 
that they may help the reader to visualise an idea or a proof, but are never a 
necessary part of the discourse. They were hand-drawn using the free graphics 
software IPE and are intended to resemble nice impromptu sketches that a teacher 
might make in class. 


Reykjavik, Iceland, November 2019
Robert Magnus, Professor Emeritus


Chapter 1
Introduction


1.1 What Is Mathematical Analysis?


Mathematical analysis, or simply analysis, is the study of limits, series, functions 
of a real variable, calculus (differential calculus and integral calculus) and related 
topics, on a logical foundation and using methods that are considered acceptable by 
modern standards. The aims of analysis were attained by two main achievements: an 
exact formulation of the properties of the real numbers, and a correct definition of the 
notion of limit. Analysis is both very reliable, in that its conclusions are considered 
correct with considerable confidence, and very useful; modern science would be 
unthinkable without calculus, for example. 

John von Neumann wrote: “The calculus was the first achievement of modern 
mathematics and it is difficult to overestimate its importance. I think it defines more 
unequivocally than anything else the inception of modern mathematics; and the 
system of mathematical analysis, which is its logical development, still constitutes 
the greatest technical advance in exact thinking.” 


1.2 Milestones in the History of Analysis 


The dates in the following list are approximate. 

c. 300 BC Euclid publishes a proof that $\sqrt{2}$ is not a rational number (Elements,
theorem 117, book 10).

c. 300 BC Euclid publishes Eudoxus’ theory of irrational numbers (Elements, 
book 5). 

c. 250 BC Archimedes solves several problems, such as that of calculating the area 
of a parabolic segment, using methods that foreshadow integration. 

1660-1690 Newton and Leibniz invent calculus (differential and integral calculus). 
They base it on the idea of infinitesimals. These are quantities that are smaller than 
any positive real number, yet are still positive and non-zero.

1660-present The sensational solution of the Kepler Problem is only the first of an 
inexhaustible supply of problems in applied mathematics, science and technology 
that are solved using calculus. 

1734 George Berkeley criticises the foundations of calculus. He asks what infinitesimals
really are, and writes: “May we not call them the ghosts of departed quantities?”

1600-1800 Infinite series are used although a clear definition of convergence is lacking.
Many exciting results are obtained by using infinite series as if they were finite
sums (most noteworthy is the work of Euler). It is known even so that uncritical use
of series can lead to contradictions, such as that 1 = 2.

1817 Bolzano gives a definition of limit not based on infinitesimals. It attracts little 
attention. 

1820 Fourier introduces Fourier series which make it possible to represent a discontinuous
function as an infinite series of trigonometric functions. The problem is
that there is still no clear concept of function, no clear concept of continuity or the 
convergence of a series. 

1828 Abel expresses the view that in all mathematics there is not a single infinite 
series whose convergence has been established by rigorous methods, and comments 
that the most important parts of mathematics lack a secure foundation. He gives the 
first proof establishing the sum of the binomial series that is recognisably correct.

1820-1850 Cauchy takes on the task of placing calculus on a secure foundation. He
uses some form of the modern $\varepsilon$, $\delta$ arguments but still frequently relies on infinitesimals.
He invents complex analysis and proves Taylor’s theorem. In 1821 he publishes
Cours d’Analyse, the forerunner of all modern books on mathematical analysis. He 
famously makes the mistake of claiming that the limit of a sequence of continuous 
functions is continuous. 

1872 Dedekind defines Dedekind sections. He carries out an exact study of the 
nature of the real numbers, the first since Eudoxus. The notion of irrational number 
is clarified. In this way a secure foundation is obtained for analysis. 

1860-1880 Weierstrass completes the work of Cauchy. The concept of uniform 
convergence is clarified. He creates famous counterexamples that show the dangers 
that lurk in uncritical thinking. One such is a continuous function that is nowhere 
differentiable. 

1860-1890 Dedekind and Cantor create set theory. It proves to be the correct language
in which to express the conclusions of analysis.


Chapter 2
Real Numbers


As professor in the Polytechnic School in Zurich I found 
myself for the first time obliged to lecture upon the 
elements of the differential calculus and felt, more keenly 
than ever before, the lack of a really scientific foundation 
for arithmetic. 


R. Dedekind. Essays on the theory of numbers 


2.1 Natural Numbers and Set Theory 


Natural numbers are used to count the members of finite sets. They are 0, 1, 2, 3
and so on. They constitute a set denoted by $\mathbb{N}$. We consider 0 a natural number; it is
needed to count the members of the empty set. The set of positive natural numbers
(that is 1, 2, 3, etc.) is denoted by $\mathbb{N}_+$.

We cannot undertake an exact treatment of analysis without set theory. We assume 
the reader understands the following formulas concerning sets: 


(a) $x \in A$. This says that $x$ is an element of the set $A$. Its negation, the statement
that $x$ is not an element of $A$, is written $x \notin A$.

(b) $A \subset B$. This says that the set $A$ is a subset of the set $B$, that is, every element of
$A$ is an element of $B$. It does not preclude the possibility that $A = B$.

(c) $A \cup B$. The union of the sets $A$ and $B$, that is, the set of elements $x$ such that
$x \in A$ or $x \in B$.

(d) $A \cap B$. The intersection of the sets $A$ and $B$, that is, the set of all elements $x$
such that $x \in A$ and $x \in B$.

(e) $\emptyset$. The empty set, mentioned above as the set with 0 elements.


If $A \subset B$ we also say that $B$ includes $A$, but never that $B$ contains $A$. The latter
would always mean that the set $A$ is an element of the set $B$. In this text we shall
mention sets of sets only rarely. Principally, the sets we need have as their elements
numbers of some kind, for example real numbers or natural numbers.

It is worth reminding the reader that the logical disjunction of two propositions $p$
and $q$, written symbolically $p \vee q$ and read “$p$ or $q$”, is true when either proposition
is true, including when both are. Thus $A \cup B$ includes the set $A \cap B$.
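
The relations recalled above can be tried out on small finite sets. The following is a minimal Python sketch, assuming that Python's built-in `set` type is an acceptable stand-in for finite sets; the particular sets `A` and `B` are chosen only for illustration and do not come from the text.

```python
# Two small finite sets, chosen only for illustration.
A = {0, 1, 4, 16}
B = {1, 2, 4, 8}

union = A | B          # elements x with x in A or x in B (inclusive "or")
intersection = A & B   # elements x with x in A and x in B

# Because "or" is inclusive, the union includes the intersection.
print(intersection <= union)   # "<=" is the subset test for Python sets

# Every set is a subset of itself, so "A is a subset of B" allows A = B.
print(A <= A)

# Membership and its negation.
print(4 in A, 2 not in A)
```

The subset test `<=` mirrors the convention of the text that inclusion does not rule out equality.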

Union and intersection are simple ways to build new sets out of old. But we will 
need much more powerful ways which we will not try to justify. For example we 
need the natural numbers to form a set. Only if this is readily believable can we 
proceed to analysis. It is a common experience of mathematicians that attempts to 
explain analysis to someone without mathematical training get stranded on this point. 
The interlocutor does not accept the existence of an infinite set, although they may 
readily accept the fact that the natural numbers are infinitely many, apprehending 
that there is an unlimited supply of them. They may find it hard to admit that there 
is any sense in treating the natural numbers as a totality, as a completed whole. This 
is not foolish; that the natural numbers form a set cannot be proved; it requires an 
axiom of set theory (axiom of infinity). Without this fact there is no analysis. We 
will see ample confirmation of this, in the great emphasis placed on sequences for 
example. 

We can also form the intersection and union of infinitely many sets (the latter 
requires an axiom in proper accounts of set theory). We will look at these constructions,
and other operations on sets, if and when they are needed, but they are readily
acceptable as common sense once the notion of an infinite set is accepted. 

In simple cases we can list the elements of a set, enclosing them in curly brackets. 
For example 

A = {0, 1, 4, 16} 


builds a set whose elements are 0, 1, 4 and 16. The order of the elements does not 
count, nor do repetitions. Thus the sets 


{1,2}, {2,1}, {2, 1,2} 


are all equal to each other. The reason is that two sets $A$ and $B$ are defined to be equal
when they have the same members, that is, when $x \in A$ if and only if $x \in B$.
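
As a small illustration of this definition of equality, here is a hedged Python sketch. The helper `same_members` is a hypothetical name introduced here, and it can only test the condition over a finite range of candidates, which is an assumption of the sketch rather than part of the definition.

```python
# Listing order and repeated entries do not affect which set is built.
s1 = {1, 2}
s2 = {2, 1}
s3 = {2, 1, 2}
print(s1 == s2 == s3)

def same_members(A, B, universe):
    """Check 'x in A if and only if x in B' for every x in a finite universe."""
    return all((x in A) == (x in B) for x in universe)

# The membership criterion agrees with built-in equality on this example.
print(same_members(s1, s3, range(10)))
```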
We might try to build the set of all even numbers by writing 


B = {0, 2, 4, 6, 8, ...}


and such formulations are readily understandable if used sensibly. However, the set of
all even numbers is more correctly formed using a prescription called specification, 
that builds the set of all elements of a given set that have a specified property. Using 
the property “n is divisible by 2”, we can form the set B of even numbers by 


$$B = \{n \in \mathbb{N} : n \text{ is divisible by } 2\}.$$


The general form of this construction is

$$B = \{x \in E : P(x)\}.$$


Here $E$ is a given set and $P(x)$ is a property (called by logicians a predicate) which
may or may not be true for different assignments of elements of $E$ to the variable $x$.
So we are singling out the set of all elements $x$ of $E$ for which $P(x)$ is true.
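
Specification has a close analogue in the set comprehensions of many programming languages. The sketch below is in Python; the function name `specify` is hypothetical, and a finite initial segment of the natural numbers is used in place of N, since a program cannot hold an infinite set.

```python
def specify(E, P):
    """Return {x in E : P(x)}, the elements of the given set E that satisfy P."""
    return {x for x in E if P(x)}

# Assumption of this sketch: the finite set {0, 1, ..., 19} stands in for N.
E = set(range(20))

evens = specify(E, lambda n: n % 2 == 0)
squares = specify(E, lambda n: any(k * k == n for k in range(n + 1)))

print(sorted(evens))
print(sorted(squares))
```

Here the predicate is passed as a function that returns true or false for each element of `E`, which matches the description of a predicate as something that may or may not be true for each assignment of the variable.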

Using this we can build lots of subsets of the set N of natural numbers; for example, 
the set of even numbers, odd numbers, prime numbers, square numbers and so on. 
But we never get anywhere near all the possible subsets of N by this means because 
we cannot form enough predicates. We may want to make a statement about all 
subsets of N. These form the elements of a set so vast that it defeats our imagination 
to encompass it (though that is nothing compared to the really big sets of set theory). 
Although we will not need it in this text, the set of all subsets of N is an important 
set for analysis, so it had better exist (again it requires a special axiom of set theory, 
the power-set axiom). 

We now state an important property of the natural numbers. We shall consider it 
an axiom. It refers to completely arbitrary subsets of N. 


Principle of induction. If $A$ is a set of natural numbers (that is, $A \subset \mathbb{N}$) such that
$0 \in A$, and such that $x + 1 \in A$ whenever $x \in A$, then $A = \mathbb{N}$.


Axiomatic set theory usually has an axiom that makes infinite sets possible (the 
axiom of infinity mentioned earlier). It posits the existence of a set that contains 0, 
and for all natural numbers x, it contains x + 1 if it contains x. From this one can 
construct the set N by defining it as the intersection of all sets that have the property 
stated in the previous sentence. If one accepts this definition of N, then the principle 
of induction follows as a theorem about N. However, it is also quite natural to take 
the existence of N as a set for granted, and the principle of induction as an axiom 
that singles out the essential nature of the natural numbers: that you can reach any 
natural number from 0 by successively adding 1; and anything that you reach in this 
way is a natural number. 

Another version of the induction principle, one that has more practical value, uses 
predicates that can be applied to the natural numbers, instead of subsets. An example 
of a predicate could be “n is divisible by 5”. 


Principle of induction with predicates. Let $P(n)$ be a predicate applicable to the
natural numbers (that is, it is true or false for each substitution of a natural number
for $n$). Assume that $P(0)$ is true, and that for every $n$ it is the case that $P(n)$ implies
$P(n + 1)$. Then $P(n)$ is true for all natural numbers $n$.


The principles are equivalent. Suppose that we assume the principle of induction.
Let $P(n)$ be a predicate with the properties that $P(0)$ is true and that $P(n)$ implies
$P(n + 1)$. We form the set $A = \{n \in \mathbb{N} : P(n)\}$ and see that $0 \in A$, and, for every
$x$, if $x \in A$ then $x + 1 \in A$. We deduce by the principle of induction that $A = \mathbb{N}$,
that is, that $P(n)$ is true for all $n$, thus establishing the principle of induction with
predicates.

Conversely, let us assume the principle of induction with predicates. Let $A \subset \mathbb{N}$.
We apply the principle of induction with predicates to the predicate “$n \in A$” to obtain
the principle of induction.

Instead of assuming that $P(0)$ is true we could assume that there is a natural
number $k$ such that $P(k)$ is true. Then the conclusion would be that $P(n)$ is true for
all natural numbers $n \ge k$.

Our first proposition says something immensely important about all possible subsets
of N, in their unimaginable variety.


Proposition 2.1 Every non-empty subset of N contains a lowest element. 


More informally: in every collection of natural numbers there is a smallest number. 
This result is often used without being explicitly mentioned; and the present work is 
probably no exception. 


Proof Given the non-empty subset $A$ of $\mathbb{N}$, let $B$ be the set

$$B = \{x \in \mathbb{N} : x \le y \text{ for all } y \in A\}.$$


The specification says that $B$ is the set of all natural numbers $x$, with the property
that $x \le y$ for all $y$ in $A$. Elements of $B$ are called lower bounds for $A$.

Now $0 \in B$. On the other hand $B$ is not all of $\mathbb{N}$, since there exists some $z \in A$
(as $A$ is not empty) and then $z + 1 \notin B$. Now we turn the induction principle on its
head and conclude that there exists $u$ in $B$ such that $u + 1$ is not in $B$ (for otherwise
$B$ would be all of $\mathbb{N}$). So $u$ is a lower bound for $A$ but $u + 1$ is not.

Now $u$ must be in $A$. For $u \le y$ for all $y \in A$, so that if $u$ is not in $A$ we would
have $u < y$ for all $y$ in $A$. Since we are only considering integers we would have
$u + 1 \le y$ for all $y$ in $A$ and $u + 1$ would be a lower bound, which it is not. We
conclude that $u \in A$ and hence $u$ is the lowest element of $A$.
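
The argument of the proof can be followed step by step on a finite example. The Python sketch below is only an illustration under that restriction (a program cannot range over arbitrary subsets of N); the function name `least_element` is hypothetical. It walks upwards through the lower bounds of `A` until it meets a lower bound `u` for which `u + 1` is no longer a lower bound.

```python
def least_element(A):
    """Find the least element of a non-empty finite set A of natural numbers
    by walking up through the lower bounds of A, as in the proof."""
    if not A:
        raise ValueError("A must be non-empty")

    def is_lower_bound(x):
        return all(x <= y for y in A)

    u = 0                        # 0 is a lower bound for any set of naturals
    while is_lower_bound(u + 1):
        u += 1                   # u + 1 is still a lower bound, so step up
    # Now u is a lower bound but u + 1 is not, so u must belong to A.
    assert u in A
    return u

A = {12, 7, 30, 9}
print(least_element(A), min(A))
```

The loop must stop, because for any element `z` of `A` the number `z + 1` is not a lower bound. The comparison with the built-in `min` is only a sanity check.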


2.1.1 Exercises 


1. Show that the following predicates are true for all natural numbers: 


(a) Either n is divisible by 2 or n + 1 is divisible by 2. 
(b) $2^n > n$.


2. (a) Prove that $2^n > n^2$ whenever $n$ is a natural number greater than or equal to 5.
(b) Prove that $2^n > n^3$ whenever $n$ is a natural number greater than or equal to 10.

3. Let $a$ be a natural number. Prove the following by induction:


(a) For all natural numbers $n$, the number $(a + 1)^n - 1$ is divisible by $a$.
(b) For all even natural numbers $n$, the number $(a - 1)^n - 1$ is divisible by $a$.


Note. This follows most naturally from the binomial theorem, which will be the topic of several 
exercises in the coming pages. 

4. Prove that the following version of the induction principle, which appears to have 
a weaker premise than the usual one, follows from the usual induction principle:

Let $P(n)$ be a predicate for the natural numbers. Assume that $P(0)$ is true,
and that whenever $P(k)$ is jointly true for all of $k = 0, 1, 2, \ldots, n$ then
$P(n + 1)$ is true. Then $P(n)$ is true for all $n$.


Hint. Consider the predicate $Q(n)$ that says that $P(k)$ is jointly true for all natural
numbers $k$ such that $0 \le k \le n$.
Note. Use of this rule is called proof by complete induction.

5. The locus classicus of proof by complete induction. Show that every natural 
number greater than or equal to 2 is divisible by some prime number. 

6. The Fermat numbers are defined by the formula $a_n = 2^{2^n} + 1$, for $n = 0, 1, 2, \ldots$.


(a) Show that if $n \ne m$ then the numbers $a_n$ and $a_m$ have no common prime
divisor. In short, they are coprime. Note that 1 is not considered a prime.
(b) Deduce that the set of primes is infinite.


Note. Euclid published a proof that the set of primes is infinite that the reader is probably more 
familiar with. 
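
Before attempting a proof of the claim about Fermat numbers in Exercise 6, it may help to look at the first few cases numerically. The following Python sketch uses `math.gcd` from the standard library; the function name `fermat` is hypothetical. A finite check of this kind is evidence only and does not replace the proof asked for in the exercise.

```python
from math import gcd

def fermat(n):
    """Return the n-th Fermat number a_n = 2^(2^n) + 1."""
    return 2 ** (2 ** n) + 1

numbers = [fermat(n) for n in range(6)]
print(numbers[:5])   # the first five Fermat numbers

# Check that distinct Fermat numbers in this range have no common divisor
# greater than 1, and hence no common prime divisor.
pairwise_coprime = all(
    gcd(numbers[i], numbers[j]) == 1
    for i in range(len(numbers))
    for j in range(i + 1, len(numbers))
)
print(pairwise_coprime)
```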


2.2 Axioms for the Real Numbers 


We shall describe the set R of real numbers by axioms, listing the properties that it 
should have from which we are confident that it is possible to derive the whole of 
analysis. We shall not attempt to build such a set, although this can be done using 
set theory. Intuitively, the real numbers model a line in Euclidean geometry, or more 
precisely, a coordinate line, like the x-axis of coordinate geometry standing alone. 
This picture is the reason why we often refer to a real number as a point. The basic 
intuition is that the line is marked off by a selection of real numbers for the purpose 
of measurement, and most importantly, any degree of accuracy can be attained by 
increasing the density of markings. 

We shall not make any essential use of this picture. Instead we set out properties 
of the real numbers in the form of axioms. These fall into three distinct groups. We 
will introduce them in stages interspersed with deductions of familiar rules, together 
with reasons why further axioms are needed. 


2.2.1 Arithmetic Axioms 


The first group of axioms concerns arithmetic. 


Axioms A. $\mathbb{R}$ is a field.


These axioms specify the algebraic operations we can perform with real numbers, 
together with their properties. There are two binary operations, $x + y$ (addition) and
$x \cdot y$ (multiplication), and two distinct constants $\bar{0}$ and $\bar{1}$, that satisfy the six axioms
in the following list:


(A1) (Commutative laws) For all $x$ and $y$ we have
$$x + y = y + x \quad\text{and}\quad x \cdot y = y \cdot x.$$
(A2) (Associative laws) For all $x$, $y$ and $z$ we have
$$(x + y) + z = x + (y + z) \quad\text{and}\quad (x \cdot y) \cdot z = x \cdot (y \cdot z).$$
(A3) (Neutral elements) For all $x$ we have
$$x + \bar{0} = x \quad\text{and}\quad x \cdot \bar{1} = x.$$
(A4) (Additive inverses) For all $x$ there exists $y$, such that
$$x + y = \bar{0}.$$
(A5) (Multiplicative inverses) For all $x$ not equal to $\bar{0}$ there exists $y$, such that
$$x \cdot y = \bar{1}.$$
(A6) (Distributive law) For all $x$, $y$ and $z$ we have
$$x \cdot (y + z) = (x \cdot y) + (x \cdot z).$$

Later we shall identify $\bar{0}$ and $\bar{1}$ with the natural numbers 0 and 1. For the moment
it seems possible that they could have properties quite unlike 0 and 1. This is the
reason for placing bars over them.
From axioms A1-A5 we can derive some common algebraic rules:


(i) (Cancellation in sums) If $x + y = x + z$ then $y = z$.
(ii) (Cancellation in products) If $x \cdot y = x \cdot z$ and $x \ne \bar{0}$ then $y = z$.


It follows from the cancellation rules that, given an element $x$, an element $y$ that
satisfies $x + y = \bar{0}$ is uniquely determined by $x$. It therefore makes sense to denote it
by $-x$. Similarly an element $y$ that satisfies $x \cdot y = \bar{1}$, given that $x \ne \bar{0}$, is uniquely
determined by $x$. We denote it by $x^{-1}$.


Exercise Prove the cancellation rules from the axioms. 


Further familiar rules can be derived with the help of axiom A6 also:


(iii) (Multiplication by $\bar{0}$) For all $x$ we have $x \cdot \bar{0} = \bar{0}$.
(iv) (Multiplication by $-\bar{1}$) For all $x$ we have $(-\bar{1}) \cdot x = -x$.


Exercise Prove the stated rules from the axioms.


Now we define subtraction by

$$x - y := x + (-y),$$

and division by

$$\frac{x}{y} := x \cdot y^{-1},$$

given of course that $y \ne \bar{0}$. As is usual the colon in the equation signifies that the
right-hand side is the definition of the left-hand side, although we are not overly
insistent over its use.

The sum and product of $n$ real numbers,

$$x_1 + x_2 + \cdots + x_n, \qquad x_1 \cdot x_2 \cdots x_n,$$

are well defined, without the need for brackets, because of the associative laws.
Strictly speaking this should be proved. The proof, using induction and a good helping
of patience, is surprisingly long. We will simply accept these facts.

Let $n$ be a positive natural number and $x$ an element of $\mathbb{R}$. We define

$$x^n := \underbrace{x \cdot x \cdots x}_{n \text{ factors}}, \qquad nx := \underbrace{x + x + \cdots + x}_{n \text{ summands}}.$$

Up to now there is no way to prove that $2\bar{1}$ (that is, $\bar{1} + \bar{1}$) is not equal to $\bar{0}$. A set
of elements that satisfy axioms A1-A6 is called a field. There are many examples
of fields that are nothing like the real numbers. For example there exists a field with
only 7 elements. In such a field we must have $7\bar{1} = \bar{0}$.
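
A field with 7 elements can be realised concretely as the integers 0, 1, ..., 6 with addition and multiplication taken modulo 7. The Python sketch below assumes this standard model; the names `P`, `add` and `mul` are introduced here only for illustration. It checks that adding the unit element to itself seven times gives zero, and that every non-zero element has a multiplicative inverse, as axiom A5 requires.

```python
P = 7  # arithmetic modulo a prime gives a field with P elements

def add(x, y):
    return (x + y) % P

def mul(x, y):
    return (x * y) % P

# Adding the unit element to itself seven times returns to zero.
total = 0
for _ in range(P):
    total = add(total, 1)
print(total)

# Axiom A5: every non-zero element has a multiplicative inverse.
inverses = {
    x: next(y for y in range(1, P) if mul(x, y) == 1)
    for x in range(1, P)
}
print(inverses)
```

No ordering compatible with the axioms that follow can exist on such a field, which is one way to see that further axioms are needed to single out the real numbers.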

Real numbers must therefore possess some other defining properties in addition 
to axioms A1-A6.


2.2.2 Axioms of Ordering 


The second group of axioms concerns the ordering properties of the real numbers. 


Axioms B. $\mathbb{R}$ is an ordered field.


There exists an order relation that can apply to pairs of elements of R, written (when 
applicable) $x < y$. This relation satisfies the following axioms:

(B1) (Trichotomy) For each $x$ and $y$ exactly one of the following three possibilities
must hold:
$$x = y, \qquad x < y, \qquad y < x.$$

(B2) (Transitivity) For all $x$, $y$, $z$, if $x < y$ and $y < z$ then $x < z$.
(B3) For all $x$, $y$, $z$, if $x < y$ then $x + z < y + z$.
(B4) For all $x$, $y$, $z$, if $\bar{0} < z$ and $x < y$ then $x \cdot z < y \cdot z$.


Note that axioms B3 and B4 relate ordering and algebraic properties. 
We define further relations: 


$x > y$ means $y < x$,
$x \le y$ means ($x < y$ or $x = y$),
$x \ge y$ means ($x > y$ or $x = y$).


We define the concepts of positive and negative. If $x > \bar{0}$ we say that $x$ is positive.
If $x < \bar{0}$ we say that $x$ is negative.
The following familiar rules are consequences of the axioms:


(i) $x$ is positive if and only if $-x$ is negative.
(ii) $x > y$ if and only if $x - y$ is positive.
(iii) $\bar{1}$ is positive.
(iv) If $x$ is positive then $x^{-1}$ is positive.
(v) For all $x$ not equal to $\bar{0}$ the number $x^2$ is positive.
(vi) If $x < y$ and $z < \bar{0}$ then $x \cdot z > y \cdot z$.


We shall give the proof of rule (iii), leaving the others as exercises. 


Proof of Rule (iii) We note that by trichotomy either $\bar{0} < \bar{1}$ or $\bar{1} < \bar{0}$ (since equality
is ruled out by the assumption of distinctness in axioms A). Assume if possible that
$\bar{1} < \bar{0}$. Then $-\bar{1}$ is positive (by rule (i), which we suppose was already proved) and
we have $(-\bar{1}) \cdot \bar{1} < (-\bar{1}) \cdot \bar{0}$ by axiom B4, that is $-\bar{1} < \bar{0}$, leading to $\bar{0} < \bar{1}$. This
contradicts the assumption $\bar{1} < \bar{0}$ and proves rule (iii).


Exercise Prove the remaining rules. 


An ordered field includes a copy of $\mathbb{N}$. Because $\bar{1} > \bar{0}$ we have, for each $x$, that
$x < x + \bar{1}$. This leads to a strictly increasing sequence $\bar{0}, \bar{1}, 2\bar{1}, 3\bar{1}, \ldots$, that is,

$$\bar{0} < \bar{1} < 2\bar{1} < 3\bar{1} < \cdots < n\bar{1} < (n + 1)\bar{1} < \cdots$$

We can never reach $\bar{0}$ again, because if we did, we could conclude that $\bar{0} < \bar{0}$, which is
impossible. By the same argument any two terms in the sequence are distinct.

The set $\{n\bar{1} : n \in \mathbb{N}_+\} \cup \{\bar{0}\}$, that is to say all elements $n\bar{1}$ where $n = 1, 2, \ldots$,
together with $\bar{0}$, is therefore a copy of the natural numbers included in $\mathbb{R}$. So we
can identify $n\bar{1}$ with the natural number $n$. From now on we write 0 instead of $\bar{0}$, 1
instead of $\bar{1}$, 2 instead of $2\bar{1}$ and so on. We will not distinguish between the natural
number $n$ and the real number $n$. We will also usually omit the dot in the product
$x \cdot y$, writing it instead as $xy$.

This is beginning to look like the familiar coordinate line. Next we must fill the 
gaps between the integers. 


2.2.3 Integers and Rationals


Elements of the set N ∪ −N, that is, of

{0, 1, 2, 3, ...} ∪ {−1, −2, −3, ...},


are called integers. The set of integers is denoted by Z.
Elements of the set

Q := {m/n : m ∈ Z, n ∈ N_+}


are called rational numbers. The (first) colon signifies as usual that the right-hand 
member is the definition of the left-hand member. The notation mixes specification 
and the listing of a set’s elements within curly brackets. A more correct, but less 
transparent formulation is 


{x ∈ R : there exist integers m and n > 0 such that x = m/n}.


A remarkable fact emerges: Q is an ordered field. The reader should check that 
all the axioms A and B are satisfied by Q. It is virtually enough to show that the sum, 
product, difference and quotient of two rational numbers are rational. An even more 
remarkable conclusion follows: 


There is no way to show that R contains elements other than those in Q, by 
means of the axioms A and B alone. 


2.2.4 Q is Insufficient for Analysis


It was an early discovery that Euclidean geometry is inconsistent with the assumption
that all segments have rational length. It was found that the diagonal of a unit square
is not rational. So Q is not enough for practical tasks like drawing up plans for a new
kitchen. The following proposition is in Euclid’s Elements.


Proposition 2.2. The real number 2 has no square root in Q. In other words, there
does not exist in Q any number x that satisfies x² = 2.


Proof We are going to use an elementary fact of number theory, that any two natural
numbers have a highest common divisor. This implies that a positive element of Q
may be written as m/n, where m and n are positive integers with highest common
divisor 1.

Suppose, if possible, that 2 has a rational square root. We express it in the form
m/n, where m and n are positive integers with highest common divisor 1. Then
m²/n² = 2, so that m² = 2n². Then 2 divides m², and therefore also m, as is easy to
see. Hence we can write m = 2k where k is a natural number. We find that 4k² = 2n²,
so that 2k² = n². By the same argument 2 divides n also. This is a contradiction
because m and n have highest common divisor 1.


This conclusion is a special case of a theorem of Gauss on polynomial equations, 
usually called Gauss’s lemma. See the exercises in this section. 

A real number that is not rational is called an irrational number. By using only 
the axioms for an ordered field it is impossible to show that any irrational number 
exists, for the simple reason that Q satisfies all the axioms for an ordered field. As 
we saw, any number that is a square root of 2 must be irrational. 

It would be inconvenient if no square root of 2 existed. We would not be able to 
assign a length, as a real number, to the diagonal of a unit square. So R is not merely 
an ordered field. We need a new axiom to ensure that irrational numbers, such as the 
square root of 2, exist. To express this axiom we need to look at subsets of R. 


2.2.5 Dedekind Sections 


The set of all subsets of R is dizzyingly large, but it seems to enter analysis in an 
essential way. The existence of this set is guaranteed by an axiom of set theory, the 
power set axiom. For fundamental analysis we can mostly get by without needing the 
set of all subsets of R, since many important subsets can be defined by specification. 
To keep things simpler we first look at certain subsets of R with a simple structure. 


Definition A Dedekind section of R is a partition of R into two disjoint sets, D_l and
D_r (left set and right set), such that R = D_l ∪ D_r, neither D_l nor D_r is empty, and
for all x ∈ D_l and y ∈ D_r we have x < y.


Recall that a set can be empty. We are explicitly excluding that D_l or D_r can be the
empty set ∅. The definition requires that D_l and D_r have no common elements. This
is expressed by the formula D_l ∩ D_r = ∅, or in words: D_l and D_r are disjoint. The
syntax is conventional but confusing, as this is not a property of the sets individually
but a property of the pair of sets.

In order to give an example of a Dedekind section it is convenient to use a new
binary operation on sets, the set difference A \ B. This is the set of all elements of
A that are not elements of B. In the case that A = R and B ⊂ R, it is usual to call
A \ B the complement of the set B.

Example of a Dedekind Section Let D_l be the set of all positive real numbers x
that satisfy x² < 2, together with all negative real numbers and 0. Let D_r = R \ D_l.


Exercise Check that this defines a Dedekind section. 


2.2.6 Axiom of Completeness 


The final axiom for the real numbers is expressed in terms of Dedekind sections. 


Axioms C. R is a complete ordered field. 


There is one additional axiom that turns an ordered field into a complete ordered 
field: 


(C1) For each Dedekind section R = D_l ∪ D_r there exists a real number t, such
that x ≤ t for all x ∈ D_l and t ≤ y for all y ∈ D_r.


We know that t is either in D_l or in D_r, but not in both. That means that either t
is the highest element of D_l or it is the lowest element of D_r.


Exercise Show that t is uniquely defined by the Dedekind section.


Often the definition of Dedekind section includes a stipulation that D_l has no
highest element. This is to ensure that there is a one-to-one correspondence between
Dedekind sections and real numbers.


2.2.7 Square Root of 2 


The trouble began with √2. We can settle the issue straight off.
Consider the Dedekind section that we defined before:


D_l = {x ∈ R : x ≤ 0} ∪ {x ∈ R : x > 0, x² < 2}
D_r = {x ∈ R : x > 0, x² ≥ 2}.


According to axiom C1 there exists t, that is either the highest element of D_l or
the lowest element of D_r. It is clear that t > 0, this being obvious if t is the lowest
element of D_r, and if t is the highest element of D_l then t ≥ 1 since 1 ∈ D_l.

We intend to show that t² = 2 by excluding the possibilities t² < 2 and t² > 2.
Then axiom B1 gives t² = 2. We show in detail that t² < 2 is impossible. The reader
is asked to give the details for excluding t² > 2 (Exercise 1 below).

Suppose that t² < 2. Then t is in D_l, and is therefore its highest element. We shall
produce a contradiction by exhibiting a real number z such that z > t but z² < 2.
Such a number z would be an element of D_l higher than t. The tricky thing is that
we have to do this without assuming that the number √2 exists.

The least cunning approach is to let z = t + ε where ε > 0. We want if possible
to choose ε so that (t + ε)² < 2, and this is equivalent to


t² + 2εt + ε² < 2,


which in turn is equivalent to


ε < (2 − t²)/(2t + ε).


Let us agree to look for ε in the interval 0 < ε < 1. For such ε we have


(2 − t²)/(2t + ε) > (2 − t²)/(2t + 1).


Therefore it suffices if ε also satisfies


ε < (2 − t²)/(2t + 1).


For example, the choice


ε = min(1/2, (2 − t²)/(2(2t + 1)))


will work, where min(a, b) denotes the lower of the two numbers a and b. The
number ε is positive and quite explicitly defined.
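The construction of z can be tried out with exact rational arithmetic. The following is only a sketch: the function name step_up is hypothetical, t is restricted to rational values, and the particular ε used is one admissible choice assumed for the illustration (any ε with 0 < ε < 1 and ε < (2 − t²)/(2t + 1) would serve equally well).

```python
from fractions import Fraction


def step_up(t):
    """Given a rational t > 0 with t*t < 2, return z > t with z*z < 2."""
    if not (t > 0 and t * t < 2):
        raise ValueError("need t > 0 and t^2 < 2")
    bound = (2 - t * t) / (2 * t + 1)       # the bound (2 - t^2)/(2t + 1)
    # Assumed choice of epsilon: at most 1/2 (so 0 < eps < 1) and half the bound.
    eps = min(Fraction(1, 2), bound / 2)
    return t + eps


t = Fraction(7, 5)
z = step_up(t)
print(z, z > t, z * z < 2)   # 267/190 True True
```

No square root is ever computed: the sketch uses only the field operations and the ordering, as the argument requires.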


2.2.8 Exercises 


1. Exclude the possibility that t² > 2 in the above argument for the existence of √2
using a similar approach. This concludes the proof that √2 exists.

2. Suppose that t > 0 and let s = (2t + 2)/(t + 2). Show that if t² < 2, then t < s
and s² < 2; moreover if t² > 2, then t > s and s² > 2. This gives another, and
purely algebraic way, to prove that the number t of the previous section is √2.

3. We take the idea of the previous exercise a step further. Let t_1 = 1 and for n =
1, 2, 3, ... we set


t_{n+1} = (2t_n + 2)/(t_n + 2).


Calculate t_n using a calculator up to t_7 (or further if you have the patience). The
results suggest that t_n approaches √2 with increasing n and could be used to
approximate √2 by rational numbers. The germ of the limit concept is apparent
here.

4. Prove that if n is a positive integer, and x and y are real numbers satisfying
0 < x < y, then x^n < y^n. This shows that the function x^n is strictly increasing
on the domain of all positive real numbers.

5. Show that in any ordered field the inequality x + x⁻¹ ≥ 2 holds for all positive x.

6. Let F be a field, that is, F is a set with operations, and constants 0 and 1, that satisfy
the axioms A1–A6. Let P be a subset of F that has the following properties:


(a) 0 is not in P.

(b) If x and y are in P then x + y is in P.

(c) If x and y are in P then x · y is in P.

(d) If x ≠ 0 then either x is in P or −x is in P.


Define a relation in F, written x < y, as follows: x < y shall mean that y − x ∈ P.
Show that this is an order relation that makes F into an ordered field for which
P is the set of all positive elements.
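For readers who prefer a program to a calculator, the iteration of Exercise 3 can be sketched as follows. The function name iterate is hypothetical, and exact fractions are used so that each t_n is displayed as a rational number; this illustrates the computation and is not a solution to the exercise.

```python
from fractions import Fraction


def iterate(n_terms):
    """Return [t_1, ..., t_n] with t_1 = 1 and t_{n+1} = (2 t_n + 2)/(t_n + 2)."""
    t = Fraction(1)
    terms = [t]
    for _ in range(n_terms - 1):
        t = (2 * t + 2) / (t + 2)
        terms.append(t)
    return terms


terms = iterate(7)
for n, t in enumerate(terms, start=1):
    # Show t_n exactly, as a decimal approximation, and how far t_n^2 is from 2.
    print(n, t, float(t), float(2 - t * t))
```

The seventh term is 239/169, whose square differs from 2 by 1/169².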


2.2.9 The Functions Max, Min, and Absolute Value 


The functions min (used in the last section) and max are examples of functions of 
two real variables. They are defined by 


min(a, b) := a, if a ≤ b;   b, if b ≤ a,

max(a, b) := a, if a ≥ b;   b, if b ≥ a.


The absolute value of x, denoted by |x|, is defined by


|x| := x, if x ≥ 0;   −x, if x < 0.

Some useful rules for manipulating absolute value and the functions max and min
are


(i) |ab| = |a||b|
(ii) |a + b| ≤ |a| + |b| (triangle inequality for real numbers)
(iii) |a − b| ≥ ||a| − |b||
(iv) max(a, b) = ½(a + b + |a − b|)
(v) min(a, b) = ½(a + b − |a − b|).
Often the correct approach when dealing with absolute values, or max and min,
is to consider cases. For example, if a ≥ 0 then |a| = a; if a < 0 then |a| = −a; if
a ≥ b then max(a, b) = a; if a ≤ b then max(a, b) = b and so on.

Rule (ii) can be proved by cases but a more elegant approach is available. It begins
by noting that a ≤ |a|.


Exercise Prove rules (i)–(v).
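Before attempting the proofs it can be reassuring to test the five rules on a grid of sample values. The sketch below does this with exact rational numbers; the sample grid and the function name check_rules are arbitrary choices made for the illustration, and a finite check of this kind is of course no substitute for a proof.

```python
from fractions import Fraction
from itertools import product

# A small grid of rational test values: -2, -7/4, ..., 7/4, 2.
samples = [Fraction(n, 4) for n in range(-8, 9)]


def check_rules(a, b):
    """Return True if rules (i)-(v) all hold for this particular pair a, b."""
    return (
        abs(a * b) == abs(a) * abs(b)                 # rule (i)
        and abs(a + b) <= abs(a) + abs(b)             # rule (ii)
        and abs(a - b) >= abs(abs(a) - abs(b))        # rule (iii)
        and max(a, b) == (a + b + abs(a - b)) / 2     # rule (iv)
        and min(a, b) == (a + b - abs(a - b)) / 2     # rule (v)
    )


all_ok = all(check_rules(a, b) for a, b in product(samples, repeat=2))
print(all_ok)   # True
```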


The functions max(a, b) and min(a, b) can be extended by induction to any finite
non-zero number of real variables. Thus for n = 2, 3, 4, 5, ... and so on, we set


max(x_1, x_2, ..., x_{n+1}) = max(max(x_1, x_2, ..., x_n), x_{n+1})
min(x_1, x_2, ..., x_{n+1}) = min(min(x_1, x_2, ..., x_n), x_{n+1}).


Defining a sequence of objects by induction means that the nth object is defined
in terms of the preceding ones and we shall see more examples of this in the next
chapter.

The sum of n numbers and the product of n numbers are also, strictly speaking, 
defined by induction, though we drew no attention to this when introducing them. 
Thus correctly, the definitions for all natural numbers n of the sum x_1 + ··· + x_n and
the product x_1 ··· x_n should be given by a scheme which could be written as


x_1 + ··· + x_{n+1} = (x_1 + ··· + x_n) + x_{n+1},  (n ≥ 1)
x_1 ··· x_{n+1} = (x_1 ··· x_n) x_{n+1},  (n ≥ 1)


or in a more formal notation that avoids the dots:


∑_{k=1}^{n+1} x_k = ∑_{k=1}^{n} x_k + x_{n+1},  (n ≥ 1).


Given a sequence of numbers x_1, x_2, x_3 and so on, the expression ∑_{k=1}^{n} x_k
introduced here denotes the sum of all the numbers x_k as k runs from 1 up to, and
including, n. Similarly, ∏_{k=1}^{n} x_k denotes the product of the numbers x_k as k runs
from 1 up to, and including, n.

The notation for sums and products, doubtless familiar to the reader, is capable of
some flexibility. For example the expression ∑_{k=m}^{n} x_k denotes the sum of all numbers
x_k as k runs from m up to n, inclusive. This would normally be used in cases where
m ≤ n. If this is not known beforehand we can still use the expression but interpret it
to be 0 if m > n. In short, the sum of an empty set of numbers is 0. The corresponding
convention for product is that the product of an empty set of numbers is 1.
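The inductive definitions, together with the conventions for empty sums and products, translate directly into loops. In the following sketch the names sum_range, product_range and identity are hypothetical; the argument x is a function giving the term x_k for each index k.

```python
import math


def sum_range(x, m, n):
    """Sum of x(k) for k = m, ..., n, built one term at a time; 0 if m > n."""
    total = 0                        # the empty sum
    for k in range(m, n + 1):        # the range is empty when m > n
        total = total + x(k)         # inductive step: previous sum plus next term
    return total


def product_range(x, m, n):
    """Product of x(k) for k = m, ..., n, built one factor at a time; 1 if m > n."""
    total = 1                        # the empty product
    for k in range(m, n + 1):
        total = total * x(k)
    return total


def identity(k):
    return k


print(sum_range(identity, 1, 10))      # 55
print(product_range(identity, 1, 5))   # 120
print(sum_range(identity, 5, 3))       # 0
print(product_range(identity, 5, 3))   # 1
print(sum([]), math.prod([]))          # 0 1
```

The last line shows that Python's built-in sum and math.prod follow the same conventions for empty collections.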

Given the n numbers x_1, ..., x_n it should be obvious that max(x_1, ..., x_n) picks out
their maximum, whilst min(x_1, ..., x_n) picks out their minimum. In fact the inductive
definition even gives us a nice algorithm for computing these values. This makes it
obvious that these quantities do not depend on the ordering of the numbers. We can
even define max(x) = x = min(x), thus defining these quantities for one variable.

The rather obvious fact that, of a finite set of real numbers, there is one that is
highest and one that is lowest is often very important; and it is equally important that
of an infinite set of numbers there may be no highest, or no lowest.
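The algorithm implicit in the inductive definition of max and min is a left-to-right fold. A minimal sketch follows; the names max2, min2, max_n and min_n are hypothetical, and functools.reduce from the Python standard library is used to carry out the induction.

```python
from functools import reduce
from itertools import permutations


def max2(a, b):
    """max of two numbers, by cases."""
    return a if a >= b else b


def min2(a, b):
    """min of two numbers, by cases."""
    return a if a <= b else b


def max_n(*xs):
    """max(x_1, ..., x_{n+1}) = max2(max(x_1, ..., x_n), x_{n+1}); needs n >= 1."""
    if not xs:
        raise ValueError("at least one number is required")
    return reduce(max2, xs)


def min_n(*xs):
    """min(x_1, ..., x_{n+1}) = min2(min(x_1, ..., x_n), x_{n+1}); needs n >= 1."""
    if not xs:
        raise ValueError("at least one number is required")
    return reduce(min2, xs)


print(max_n(3, 1, 4, 1, 5), min_n(3, 1, 4, 1, 5))   # 5 1
print(max_n(7), min_n(7))                           # 7 7
# The result does not depend on the order of the arguments:
print(all(max_n(*p) == 5 for p in permutations((3, 1, 4, 1, 5))))   # True
```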


2.2.10 Mathematical Analysis 


Mathematical analysis is the branch of mathematics that is based on the axiom of
completeness of the real numbers, axiom C1. Most common functions, including
those used in school mathematics, can only be adequately defined using mathematical
analysis. A thorough treatment of functions and their role in analysis really starts in
Chap. 4, but we give here a derivation of the nth root function, where n is a natural
number, and more generally, the function x^a, where a is a rational exponent. The
basic role of the completeness axiom will be apparent.

We saw that by means of axiom C1 we could prove that the equation x² = 2 has
a positive root; and that without it, or more precisely, on the basis of axioms A and
B alone, we could not. It is easy to see that there is only one positive root. Indeed we
can observe that the function x² is strictly increasing for positive values of x. So if
it takes the value 2 it can do so only once for a positive x.

We can use the same logic to show that the equation x² = y has a unique positive
root for each positive real number y. We can therefore define the function √y, the
square root of y, as follows: for each positive real number y the number √y is the
unique positive root of x² = y.

In the same way we can define the nth root of y (where n is a natural number
greater than 1 and y > 0) as the unique positive real number x that satisfies x^n = y.
One can imitate the method used to show that √2 exists, defining the left-hand set
of a Dedekind section as the set of all real numbers x such that x > 0 and x^n < y,
together with all non-positive numbers. The binomial theorem is useful to finish the
argument (see Exercises 11, 12 and 13).

Later in this text, in Chap. 4, an argument of a more general nature will be presented,
based on continuity and deploying a powerful result called the intermediate
value theorem. This is preferable to the rather clumsy approach using Dedekind sections.
However it is done, the outcome is a function ⁿ√y that produces the positive
nth root of the positive number y.

It now turns out that ᵏ√(y^ℓ) depends only on the ratio ℓ/k (see the exercises). We
denote it by y^{ℓ/k}, thus defining a fractional power of y. The scheme is extended
to negative powers by setting y^{−a} = (y^a)^{−1}, and to the zeroth power by y^0 = 1. In
this way we can define y^a for rational a and positive y. Although certain powers of
negative numbers make sense, for example (−2)^{1/3}, it is safest to assume that y > 0.

The laws of exponents (such as y^{a+b} = y^a y^b) are satisfied, and one can piece
together a cumbersome proof of them (Exercise 19). It may be thought better to wait
until arbitrary real powers have been defined (Chap. 7) and give a nice, smooth proof
of these.


2.2.11 Exercises (cont’d)¹


7. Prove the distributive rules for max and min:

max(x, min(y, z)) = min(max(x, y), max(x, z)),
min(x, max(y, z)) = max(min(x, y), min(x, z)).

8. Let a and b be rational numbers such that √a + √b is rational. Show that √a
and √b are both rational.

9. The Fibonacci numbers are the sequence of integers 1, 1, 2, 3, 5, 8, 13, ... and
so on, in which each integer from 2 onwards is the sum of the two that precede
it. Sequences will be studied thoroughly in Chap. 3. The sequence of Fibonacci
numbers satisfies the recurrence relations


a_{n+2} = a_{n+1} + a_n,  n = 1, 2, 3, ...  (2.1)


There are many different sequences that satisfy these relations. The Fibonacci
numbers are distinguished by the initial conditions a_1 = 1, a_2 = 1. The purpose
of this exercise is to develop a formula for the nth Fibonacci number. You will
see that it requires the use of irrational numbers. You can use the following steps:


(a) Show that there exist exactly two distinct real numbers, r_1 and r_2, such that if
λ = r_1 or λ = r_2 then the sequence a_n = λ^n, (n = 1, 2, 3, 4, ...) satisfies the
recurrence relations (2.1). Find r_1 and r_2 and show that they are irrational.

(b) Show that further solutions of the recurrence relations can be obtained by
setting a_n = A r_1^n + B r_2^n where A and B are real constants.

(c) The Fibonacci numbers form the uniquely defined sequence of natural numbers
that satisfy the recurrence relations together with the initial values
a_1 = a_2 = 1. Find A and B such that the solution a_n = A r_1^n + B r_2^n satisfies
a_1 = a_2 = 1, thus producing a formula for the nth Fibonacci number.

10. The binomial coefficients \binom{n}{k}, where n and k are natural numbers and
0 ≤ k ≤ n, are defined as


\binom{n}{k} = n!/(k!(n − k)!).


¹ This is the second group of exercises in Sect. 2.2. For this reason the numbering is continued from
the previous set.

The exclamation mark used here is the factorial symbol, doubtlessly familiar to
the reader. For completeness we recall that given a positive integer n we denote
by n! the product of all positive integers less than or equal to n. We define 0! to
be 1.

Prove the addition formula:


\binom{n+1}{k} = \binom{n}{k−1} + \binom{n}{k},  1 ≤ k ≤ n.


Note. The formula to be proved is the basis of Pascal’s triangle, a nice way to compute the
binomial coefficients, and an obvious indication that they are all integers.

11. Prove the binomial rule: for all natural numbers n and real a and b we have


(a + b)^n = ∑_{k=0}^{n} \binom{n}{k} a^{n−k} b^k.


Hint. Induction is one possibility, making use of the previous exercise.

12. Suppose that y > 0, x > 0 and x^n < y, where n is a positive integer. Without
using the existence of the nth root of y, show that there exists z, such that z > x
and z^n < y.

Hint. You can let z = x + ε and imitate the argument that established the existence
of √2. The binomial rule can be useful. The result is essentially a proof
that ⁿ√y exists. See the next exercise.

13. Let y > 0. Write out an argument for the existence of ⁿ√y, where n is a positive
integer, based on the Dedekind section of R whose left set is the set


D_l = {x ∈ R : x ≤ 0} ∪ {x ∈ R : x > 0, x^n < y}.


14. Prove that if the square root of a natural number n is rational then the square
root is a natural number.

Hint. You will need a theorem of number theory: if a is a natural number greater
than 1 and a prime p divides a^n then p divides a.

15. Prove Gauss’s lemma. If a polynomial with leading coefficient 1


x^n + c_{n−1} x^{n−1} + ··· + c_1 x + c_0


has integer coefficients, then every rational root is an integer.

Hint. If a is an integer and a prime p divides a^n then p divides a.

Note. This generalises the result of the previous exercise. It also leads to a simple algorithm
for finding all the rational roots of such a polynomial: if c_0 ≠ 0 test all integral divisors of c_0
for roothood.
16. Prove the formulas:


(a) ∑_{k=1}^{n} k = ½ n(n + 1)
(b) ∑_{k=1}^{n} k² = (1/6) n(n + 1)(2n + 1)
(c) ∑_{k=1}^{n} k³ = ¼ n²(n + 1)²
(d) ∑_{k=1}^{n} (2k − 1) = n².


17. Prove the Cauchy–Schwarz inequality. Let a_k and b_k be real numbers for k =
1, 2, ..., n. Then


(∑_{k=1}^{n} a_k b_k)² ≤ (∑_{k=1}^{n} a_k²)(∑_{k=1}^{n} b_k²).


Equality holds if and only if the two n-vectors (a_1, ..., a_n) and (b_1, ..., b_n) are
linearly dependent; putting it differently, equality holds if and only if either
b_k = 0 for all k or else there exists t such that a_k = t b_k for all k.

Hint. Let P(t) = ∑_{k=1}^{n} (a_k + t b_k)² and note that P(t) ≥ 0 for all t, whilst, unless
all the b_k are 0, P(t) is a second-degree polynomial. Recall some school algebra
about second-degree polynomials that never take negative values.

18. Show that ᵏ√(y^ℓ) depends only on the ratio ℓ/k. Here we assume that y is a positive
real number, and that k and ℓ are positive integers.

19. Let x > 0, y > 0 and let a and b be rational numbers. Prove the three laws of
exponents


y^{a+b} = y^a y^b,  (y^a)^b = y^{ab},  x^a y^a = (xy)^a.

Hint. Planning is everything. First prove the laws in the case that a and b are
natural numbers. That’s just a matter of counting factors and using the associative
and commutative laws. Next do the case when the powers are the reciprocals of
natural numbers, after that the case of positive rational powers, finally rational
powers; or else wait until arbitrary powers, possibly irrational, have been defined
in Chap. 7.

20. Prove the following useful properties of rational powers:


(a) If a > 0 and 0 < x < 1 then 0 < x^a < 1.
(b) If a > 0 and x > 1 then x^a > 1.
(c) If a > 1 and 0 < x < 1 then x^a < x.
(d) If a > 1 and x > 1 then x^a > x.
(e) If 0 < a < 1 and 0 < x < 1 then x^a > x.
(f) If 0 < a < 1 and x > 1 then x^a < x.
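The algorithm mentioned in the Note to the exercise on Gauss’s lemma (test all integral divisors of c_0 for roothood) can be sketched as follows. The function name rational_roots and the convention of passing the coefficients as a list [c_0, c_1, ..., c_{n−1}] are assumptions made for this illustration; the leading coefficient 1 is implicit, and c_0 is required to be non-zero.

```python
def rational_roots(coeffs):
    """Rational roots of x^n + c_{n-1} x^{n-1} + ... + c_1 x + c_0.

    coeffs is the list [c_0, c_1, ..., c_{n-1}] of integers with c_0 != 0.
    By Gauss's lemma every rational root is an integer, and every integer
    root divides c_0, so only the divisors of c_0 need to be tested.
    """
    c0 = coeffs[0]
    if c0 == 0:
        raise ValueError("c_0 must be non-zero")

    def p(x):
        # Horner's scheme, starting from the leading coefficient 1.
        value = 1
        for c in reversed(coeffs):
            value = value * x + c
        return value

    divisors = [d for d in range(1, abs(c0) + 1) if c0 % d == 0]
    candidates = divisors + [-d for d in divisors]
    return sorted(r for r in candidates if p(r) == 0)


print(rational_roots([-6, 11, -6]))   # x^3 - 6x^2 + 11x - 6  ->  [1, 2, 3]
print(rational_roots([-2, 0]))        # x^2 - 2               ->  []
```

The empty result for x² − 2 is in agreement with Proposition 2.2.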


2.3. Decimal Fractions 


Every positive real number has a representation typified by 


x = 1001.3835104779... 
where the digits to the right of the point, taken from the numbers 0, 1, 2, 3, 4, 5, 6,
7, 8, 9, continue indefinitely. Conversely, every expression of this kind defines a real 
number. Negative numbers are included by writing a minus sign at the front. 

These facts are consequences of axiom C1. The concept of limit is needed to make 
the nature of the decimal representation precise. The reader is certainly familiar with 
decimal fractions² and they do provide a valuable basis for one’s intuition about
real numbers. The following discussion, which logically should come later, uses 
the notion of an infinite series, to be fully explained in a subsequent chapter. The 
purpose for giving it here is to forge an immediate link between the real numbers and 
objects of the reader’s experience. The subject of decimals will be taken up again 
after infinite series have been properly introduced. 

A repeating decimal (also called a recurring decimal) is one of the form (to give 
an explanatory example by way of definition) 


x = 1001.3835104779477947794779... 


the recurrence being that the string 4779 is supposed to repeat indefinitely. We specify
this by the notation

x = 1001.383510\overline{4779}.


A terminating decimal is one of the form

x = 1001.38351\overline{0}


and is written more simply

x = 1001.38351.


2.3.1 Practical and Theoretical Meaning of Decimals 


Practically, decimals are invaluable. They make it possible to calculate using real 
numbers. Theoretically, a decimal is a sequence of rational approximations to a real 
number, that are of the form a/10^m, where a and m are integers. As an example
consider

√2 = 1.4142135623...


where more digits can be found by a simple algorithm which may be repeated indefinitely.
It is an example of an infinite series


1 + 4/10 + 1/10² + 4/10³ + 2/10⁴ + 1/10⁵ + 3/10⁶ + 5/10⁷ + 6/10⁸ + 2/10⁹ + ···


Truncating to a finite number of terms gives a rational approximation to √2. A
commonly used approximation to √2 is 1.414 = 1414/1000, obtained by truncating
the series after the fourth term.

² Often the term “decimal fraction” is used to mean a rational number whose denominator is a power
of 10. We use the term to mean a real number between 0 and 1 in its decimal representation.

The rational numbers stand out in this scheme. They are the repeating decimals.
Examples are


1/16 = 0.0625 = 0.0625\overline{0}


3/52 = 0.05\overline{769230}.


2.3.2 Algorithm for Decimals 


Let x be a positive real number. The following algorithm produces the decimal 
representation of x. 

First we find the highest natural number less than or equal to x. The existence of 
this natural number is a consequence of axiom C1 as we shall see in a later section. 
Write this natural number in the decimal system (we assume that the procedure for 
this is known) and place a decimal point after it, for example 


1001. 


Subtract this from x. There remains a number x_1 in the interval 0 ≤ x_1 < 1. We call
this the first remainder. Now we have 0 ≤ 10x_1 < 10. Let d_1 be the integer part of
10x_1, that is, the highest integer that is less than or equal to 10x_1. It is one of the
numbers 0, 1, 2, 3, 4, 5, 6, 7, 8, 9. Call it the first decimal digit and write it after the
decimal point, thus

1001.3


Subtract it from 10x_1. This leaves the second remainder x_2 = 10x_1 − d_1, also in the
interval 0 ≤ x_2 < 1. Next form 10x_2, then its integer part, called the second decimal
digit d_2, and the third remainder x_3 = 10x_2 − d_2 and so on.

In each step the remainder determines the next decimal digit and the next remainder,
according to the scheme


d_n = [10x_n],  x_{n+1} = 10x_n − d_n.

In this notation [y] denotes in general the highest integer (in this case it must be a
natural number) less than or equal to y. The algorithm ends if the remainder is 0 at
some step.
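The scheme d_n = [10x_n], x_{n+1} = 10x_n − d_n is easily turned into a program. The sketch below is restricted to rational x, represented exactly by Python's Fraction type, because floating-point numbers would introduce rounding errors into the remainders; the function name decimal_digits and the parameter count (the maximum number of digits wanted) are hypothetical.

```python
from fractions import Fraction
from math import floor


def decimal_digits(x, count):
    """Return (integer part, list of up to `count` decimal digits) of x >= 0."""
    integer_part = floor(x)
    remainder = x - integer_part            # first remainder x_1, 0 <= x_1 < 1
    digits = []
    for _ in range(count):
        if remainder == 0:                  # the algorithm ends: terminating decimal
            break
        d = floor(10 * remainder)           # d_n = [10 x_n]
        digits.append(d)
        remainder = 10 * remainder - d      # x_{n+1} = 10 x_n - d_n
    return integer_part, digits


print(decimal_digits(Fraction(1, 16), 10))   # (0, [0, 6, 2, 5])
print(decimal_digits(Fraction(3, 52), 8))    # (0, [0, 5, 7, 6, 9, 2, 3, 0])
```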
2.3.3 Decimal Representation of Rational Numbers


Why is the decimal representation of a rational number a repeating decimal? 

If x = m/n (with positive integers m and n) the remainder at each step is a rational
number of the form a/n in the interval 0 ≤ a/n < 1. But there are only n of these,
namely


0, 1/n, 2/n, ..., (n − 1)/n.


When the (n + 1)st remainder is reached two of the remainders that have already
been calculated must be equal, for example x_k = x_l with k < l. The decimal digits
that are determined by x_k, ..., x_{l−1} then repeat themselves.

If the decimal representation thus obtained is not terminating, the remainders are 
never 0. The length of the repeating string, called the period, is therefore at most 
n — 1; similarly the length of the initial string (before the repeats commence) cannot 
exceed n — 1. 
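The argument can be followed step by step in a program that records the numerators a of the remainders a/n and stops as soon as one of them occurs a second time. This is a sketch only; the function name repeating_decimal and the form of the returned triple are choices made for the illustration.

```python
def repeating_decimal(m, n):
    """Return (integer part, initial digits, repeating digits) of m/n, m >= 0, n > 0."""
    integer_part, a = divmod(m, n)      # the first remainder is a/n with 0 <= a < n
    seen = {}                           # numerator a -> position of the digit it produces
    digits = []
    while a not in seen:                # at most n different numerators can occur
        seen[a] = len(digits)
        d, a = divmod(10 * a, n)        # digit [10a/n] and the next numerator
        digits.append(d)
    start = seen[a]                     # the remainder a/n occurred here before
    return integer_part, digits[:start], digits[start:]


print(repeating_decimal(3, 52))   # (0, [0, 5], [7, 6, 9, 2, 3, 0])
print(repeating_decimal(1, 16))   # (0, [0, 6, 2, 5], [0])
```

A terminating decimal appears here as one whose repeating string is the single digit 0, in agreement with the notation introduced earlier.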


2.3.4 Repeating Decimals and Geometric Series 


Repeating decimals are examples of geometric series (about which more later). This
is why they represent rational numbers. Look at the example


0.05\overline{769230}.

This is the number x that satisfies


100x = 5.\overline{769230}

     = 5 + 769230 × (1/10⁶ + 1/10¹² + 1/10¹⁸ + ···)


(if you don’t know how to sum the series don’t worry; you’ll learn this later)


     = 5 + 769230 × (1/10⁶) × 1/(1 − 1/10⁶)

     = 5 + 769230/999999

     = 5769225/999999.


Perhaps unexpectedly this simplifies drastically to 75/13, so that x = 75/1300 = 3/52. This
typifies the conversion of repeating decimals to ordinary fractions.
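The same calculation can be carried out mechanically. In the sketch below the function name from_repeating and the convention of passing the digits as strings are assumptions made for the illustration; the repeating block of length q contributes its value divided by 10^q − 1, which is the sum of the geometric series, shifted past the initial digits.

```python
from fractions import Fraction


def from_repeating(integer_part, initial, repeating):
    """Value of the decimal integer_part.initial followed by `repeating` forever.

    `initial` and `repeating` are strings of decimal digits; either may be empty.
    """
    p, q = len(initial), len(repeating)
    value = Fraction(integer_part)
    if initial:
        value += Fraction(int(initial), 10 ** p)
    if repeating:
        # Geometric series: block/10^q + block/10^(2q) + ... = block/(10^q - 1),
        # moved p places to the right of the decimal point.
        value += Fraction(int(repeating), (10 ** q - 1) * 10 ** p)
    return value


print(from_repeating(0, "05", "769230"))   # 3/52
print(from_repeating(0, "0625", "0"))      # 1/16
```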




2.3.5 Exercises 


1. Convert the following vulgar fractions to decimals. 


@@ 
(b) 95 
() 5 
@ F 


Note. The perspicacious reader may have noticed that the decimal algorithm is really the same 
as the long division algorithm as it was taught in primary schools two generations ago. This 
provides a convenient way to organise the calculation with pencil and paper. 

2. If you know how to sum a geometric series you can practise by converting the 
following repeating decimals to vulgar fractions. 


(a) 0.10 
(b) 0.510 
(c) 0.55101 


2.4 Subsets of R 


Subsets of the real line play a fundamental role in analysis. Most of the many equivalent
versions of the completeness axiom involve properties of subsets of R. On a
more elementary level, most functions treated in calculus are defined on a subset of
R, which in many cases is not all of R.


2.4.1 Intervals 


A subset of R is called an interval if it is of one of the following ten types (take note 
of the notation introduced in each line): 


(1) ]a, b[ := {x ∈ R : a < x < b}
(2) [a, b] := {x ∈ R : a ≤ x ≤ b}
(3) ]a, b] := {x ∈ R : a < x ≤ b}
(4) [a, b[ := {x ∈ R : a ≤ x < b}
(5) ]a, ∞[ := {x ∈ R : a < x}
(6) ]−∞, b[ := {x ∈ R : x < b}
(7) [a, ∞[ := {x ∈ R : a ≤ x}
(8) ]−∞, b] := {x ∈ R : x ≤ b}
(9) ]−∞, ∞[ := R
(10) ∅ (the empty set).

The numbers a and b are called the endpoints of the interval (though not ±∞;
they are not numbers). Open intervals are ones that contain no endpoints, although
they may have endpoints (items 1, 5, 6, 9, 10). Closed intervals are ones that contain
all their endpoints (items 2, 7, 8, 9, 10), and they do this trivially if they have none.

The notation for open intervals varies somewhat in the literature. The use of the 
reverse bracket is sometimes disparaged. Instead of writing “Ja, b[” it is probably 
more usual to prefer “(a, b)”, particularly in the English speaking world, with similar 
changes for the other types of interval. In fact both notations are specified in ISO 
standard 31-11. Moreover the reverse bracket is used by that most authoritative author 
Bourbaki. None of this particularly recommends it of course. However, there is a long 
history of using the expression “(a, b)” to denote the point in a coordinate plane with 
first coordinate a and second coordinate b. The same notation is customarily used 
for an ordered pair in set theory. 

From a practical point of view, the risk of misunderstanding the reverse bracket 
notation is near to zero. That is why it is preferred in this text. 


2.4.2 The Completeness Axiom Again 


The notation of intervals enables us to express the completeness axiom in a concise 
form. 


Axiom C1 restated 


Let a Dedekind section be given with left set D_l and right set D_r. Then there exists
a real number t such that, either


D_l = ]−∞, t] and D_r = ]t, ∞[


or


D_l = ]−∞, t[ and D_r = [t, ∞[.

2.4.3 Bounded Subsets of R 


As examples of subsets of R we have seen the various intervals. From them we can
form more subsets by intersection and union. Even so the variety of the set of all 
subsets of R is mind-dazzling, and there are many unsolved problems about them. 

We now define some important properties that subsets of R may possess; they 
involve the ordering of the real numbers. 


(a) A subset A of R is said to be bounded above if there exists y ∈ R such that x ≤ y
for all x ∈ A. A number y that has this property is called an upper bound of A.

(b) A subset A of R is said to be bounded below if there exists y ∈ R such that y ≤ x
for all x ∈ A. A number y that has this property is called a lower bound of A.

(c) A subset A of R is said to be bounded if it is both bounded above and bounded 
below. 


2.4.4 Supremum and Infimum 


We prove an important consequence of axiom C1 that applies to arbitrary subsets of 
R that are either bounded above or bounded below. 


Proposition 2.3 


(1) Let A be a non-empty subset of R that is bounded above. Then the set of all
upper bounds of A is an interval of the form [u, ∞[.

(2) Let A be a non-empty subset of R that is bounded below. Then the set of all
lower bounds of A is an interval of the form ]−∞, v].


Proof It suffices to prove the result for case 1, from which case 2 is a simple deduction.
Let the non-empty set A be bounded above. Then there exists an upper bound y.
Every z > y is also an upper bound. However, not every number is an upper bound.
Since A is not empty there exists x ∈ A, and then x − 1 is not an upper bound. Let
D_r be the set of all upper bounds and let D_l be the complement of D_r. It is clear that
the pair D_l and D_r form a Dedekind section. By axiom C1 there exists a number u,
such that either u is the lowest element of D_r or else it is the highest element of D_l.

However, D_l cannot have a highest element. There cannot be a number which is
the highest of all numbers that are not upper bounds of A. For if v is not an upper
bound of A there must exist x ∈ A, such that v < x; and then the number ½(v + x)
lies between v and x, is equal to neither, is not an upper bound of A, but is higher
than v.

We conclude that u is the lowest element of D_r. The set of upper bounds is then
the interval [u, ∞[.


Exercise Derive case 2 of Proposition 2.3 from case 1. 


We can paraphrase Proposition 2.3 as follows. If a non-empty set of real numbers 
is bounded above, then among all upper bounds there is one that is lowest, often 
called the least upper bound. If a non-empty set of real numbers is bounded below, 
then among all lower bounds there is one that is highest, often called the greatest 
lower bound. 


Definition Let A be a non-empty subset of R that is bounded above. The lowest 
upper bound of A is called the supremum of A. It is denoted by sup A. 


Definition Let A be a non-empty subset of R that is bounded below. The highest 
lower bound of A is called the infimum of A. It is denoted by inf A. 
The notions of supremum and infimum, which Proposition 2.3 enables us to define,
are immensely important. The proposition is a consequence of axiom C1, but in most
applications we use the proposition rather than the axiom. In fact this is probably
the last time we use the axiom. Many mathematicians actually use Proposition 2.3
as an axiom, an alternative to axiom C1 (to which it is in any case equivalent). More
precisely it is the first part (stating that every non-empty set that has an upper bound 
has a least upper bound) that is often used as an axiom, the second part being an 
obvious corollary of the first. 


Exercise Show that the statement italicised in the last paragraph implies axiom C1.


There are dozens (and that can be taken literally) of equivalent formulations of 
the completeness of the real numbers. Probably the most common is the postulate 
that every non-empty subset of R, that is bounded above, has a supremum. It is a 
matter of psychology which is preferred, but in this text axiom C1 is preferred. It is 
doubtful if anyone has a mental picture of an arbitrary subset of R. Maybe a sort of
diffuse, one-dimensional cloud. Moreover an axiom that says something about all 
bounded subsets of real numbers seems to require the set of all subsets of R, which 
is a set vastly bigger than R itself. An axiom about all Dedekind sections is much 
simpler. A Dedekind section of R is both intuitive and highly graphic: we just cut 
the x-axis with a transversal line. 


2.4.5 Exercises 


1. Show that A is bounded if and only if there exists k > 0 such that |x| < k for all
$x \in A$.

2. Determine which of the intervals listed 1-10 are bounded, bounded above, or 
bounded below. In each case find all upper bounds and all lower bounds. 

3. Let $u < x$. Prove that $u < \frac{1}{2}(u + x) < x$. (This was a key step in the proof of
Proposition 2.3.) 

4. It is important to acquire the feeling that Proposition 2.3 is not obvious. To help 
with this acquisition we can consider replacing R by Q. The notions of upper 
bound and lower bound can be defined for subsets of Q just as they were for 
subsets of R. Consider the set $A = \{x \in \mathbb{Q} : x^2 < 2\}$. Show that A is bounded
but has neither a least upper bound nor a greatest lower bound (in Q that is). 


2.4.6 Supremum or Maximum? 
Let us consider some examples of supremum and infimum. 


(a) Let A = [0,1]. Then sup A = 1. 
(b) Let B = [0, 1[. Then sup B = 1 also.

In the first case 1 is the highest element of A, or the maximum of A. In the second
case B has no highest element, but it has a supremum. 

A non-empty set that is bounded above always has a supremum, but not necessarily 
a maximum, though if a maximum exists they are equal. The same applies to infimum
and minimum. Confusion of the two caused many problems in the early days of 
analysis, and still does today among students of mathematics. Even the greatest 
mathematicians have come unstuck; Riemann’s failed proof of the existence of a 
solution to the Dirichlet problem involved the confusion of infimum and minimum. 


2.4.7 Using Supremum and Infimum to Prove Theorems 


Much of analysis consists of proofs that certain desirable objects exist. These are often 
solutions to equations. We have already considered the square root of 2. This is the 
positive solution to the equation $x^2 = 2$. It is just a simple example of a polynomial
equation. The treatment we gave was rather clumsy; necessarily so since useful tools 
were yet to be introduced. 

Supremum and infimum are precisely such tools that enable us to mobilise the 
completeness axiom. When we want to prove for example that an equation has a 
solution, these tools can hand us a number that is a candidate for a solution. 

Arguments using supremum or infimum often proceed in the following way. Suppose
that A is a set of real numbers that is bounded above. Let $t = \sup A$. There are
two important things we can say about t and in most applications both are needed.
Firstly, for all $x \in A$ we have $x \le t$ (because t is an upper bound). Secondly, suppose
that $\varepsilon > 0$. Then $t - \varepsilon$ is not an upper bound (since t is the lowest one). This means
that there exists $x \in A$ such that $t - \varepsilon < x$. Another way of expressing this is to say
that if $y < t$ there exists $x \in A$ such that $y < x$.

The reader is invited to convince themselves that sup A is precisely the unique
number t that has these two properties.

Consider again the square root of 2. Let us run through the argument again but 
this time we use least upper bound. We can let A be the set defined by 


$$A := \{x \in \mathbb{R} : x > 0,\ x^2 < 2\}.$$


Then A is bounded above; it is easily seen for example that 2 is an upper bound.
So let $t = \sup A$. Now we can show that $t^2 = 2$. If $t^2 > 2$ then we can find by the
argument of Sect. 2.2 (under “Square root of 2”) a number z such that $0 < z < t$ and
$z^2 > 2$. It follows that z is an upper bound of A but is lower than t, in contradiction to
the assumption that t was the least upper bound. Again if $t^2 < 2$ we can find by the
argument of the same section a number y such that $t < y$ and $y^2 < 2$. Then $y \in A$,
so that t is not even an upper bound of A, again a contradiction. The only conclusion
available (by trichotomy) is that $t^2 = 2$.
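
The way sup A serves as a candidate for the solution can be illustrated numerically. The following is only a sketch and plays no part in the proof: it brackets t = sup A by bisection, keeping a lower endpoint that belongs to A and an upper endpoint that is an upper bound of A. The function name `bracket_sup` and the use of exact `Fraction` arithmetic are assumptions made for the illustration.

```python
from fractions import Fraction


def bracket_sup(steps):
    """Bracket sup A for A = {x in R : x > 0, x^2 < 2} by bisection.

    Returns (lo, hi) where lo belongs to A and hi is an upper bound of A,
    so that lo < sup A <= hi and hi - lo = 2**(-steps).
    """
    lo, hi = Fraction(1), Fraction(2)  # 1 is in A; 2 is an upper bound of A
    for _ in range(steps):
        mid = (lo + hi) / 2
        if mid * mid < 2:
            lo = mid  # mid belongs to A, so sup A >= mid
        else:
            hi = mid  # mid > 0 and mid^2 >= 2, so mid is an upper bound of A
    return lo, hi


lo, hi = bracket_sup(30)
print(float(lo), float(hi))
```

Both endpoints are rational, so neither can equal t; the bracket only narrows. That t itself exists is exactly what completeness supplies.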

Later we shall exploit continuity to carry out arguments like this one in a stream- 
lined fashion. 
2.4.8 The Archimedean Property of R


The property considered in this section further supports our intuition that the axioms 
adequately convey the notion of the real coordinate line. It may be viewed as the 
theoretical underpinning of the measuring tape. 


Proposition 2.4 For every real number x there exists a natural number n such that
$n > x$.


Proof Assume, to the contrary, that there exists a real number x, such that every
natural number n satisfies $n \le x$. The set N is then bounded above in R and therefore
has a supremum t. But then there exists a natural number m such that $m > t - 1$
(otherwise t would not be the least upper bound of N). It follows that $m + 1 > t$, and
we have the contradiction that t is not an upper bound of N.


Here we have a result which, like Proposition 2.3, is apt to seem obvious and not 
needing proof. However it is impossible to prove the Archimedean property of R by 
using the axioms A and B alone. To show this one has to produce an example of an 
ordered field, a so-called non-Archimedean field, in which the set of natural numbers 
(always included in an ordered field) is bounded above. This can be done. 

Although axiom C1 was used in the proof of Proposition 2.4, it is not equivalent 
to it. Putting it differently we cannot use the Archimedean property as an axiom 
replacing axiom C1 and expect to produce all the desired properties of the real
numbers, although dozens of usable replacements for axiom C1 are known. In fact 
the field Q alone has the Archimedean property, but contains no irrational numbers, 
and we have seen that at least some irrationals are needed for analysis. 

Nevertheless Proposition 2.4 supports the intuition that the real numbers, based on 
the axioms A, B and C, satisfactorily describe the coordinate line, in as far as the latter 
contains no points lying entirely to the right of all points with integer coordinates. 

We can go further in producing a satisfactory model of the coordinate line as we 
understand it intuitively. A simple consequence of Proposition 2.4 is that if x is a 
positive real number, there exists a highest natural number, usually denoted by [x], 
that is lower than or equal to x. This number is $n - 1$, where n is the lowest natural
number strictly greater than x (which exists by Propositions 2.4 and 2.1). Finding it is
the first step in constructing the decimal representation of a number x. Furthermore it
is easy to extend the function [x] to all real numbers (see the following exercises). For
every real number x there exists a unique element m of Z, such that $m \le x < m + 1$.
Every real number falls within an interval whose endpoints are consecutive integers. 
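
The description of [x] for positive x translates directly into a search. The sketch below is an illustration only: the function name `integer_part` is hypothetical, and the inputs are taken to be exact rationals. It looks for the lowest natural number n with n > x and returns n − 1. The Archimedean property is what guarantees that the loop terminates.

```python
from fractions import Fraction
import math


def integer_part(x):
    """Return [x] for x >= 0: one less than the lowest natural number n with n > x."""
    n = 1
    while n <= x:  # some natural number exceeds x, so this loop stops
        n += 1
    return n - 1


samples = [Fraction(1, 2), Fraction(3), Fraction(22, 7), Fraction(999, 10)]
print([integer_part(x) for x in samples])
```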


2.4.9 Exercises (cont’d) 


5. Find the supremum, infimum, maximum and minimum of the following sets of 
numbers, whenever they exist, justifying your conclusions: 

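(a) [0, 1]

(b) ]0, 1[

(c) $\{x \in \mathbb{R} : x^2 < 2 \text{ and } x \text{ is rational}\}$

(d) $\{x \in \mathbb{R} : x^2 \le 2 \text{ and } x \text{ is rational}\}$

(e) $\{x \in \mathbb{R} : x^2 < 4 \text{ and } x \text{ is rational}\}$

(f) $\{x \in \mathbb{R} : x^2 \le 4 \text{ and } x \text{ is rational}\}$

(g) $\{x \in \mathbb{R} : x^2 + x + 1 < 0\}$

(h) $\{x \in \mathbb{R} : x^2 + x - 1 < 0\}$.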


6. Let A and B be subsets of R, both bounded, and suppose that $A \subset B$. Show that
$\sup A \le \sup B$ and $\inf A \ge \inf B$.

7. Let A be a set of real numbers with the following property: whenever a and b are
distinct points of A and $a < x < b$ then x is in A. Prove that A is an interval.
Hint. In the case when A is bounded and non-empty one can let $u = \inf A$ and
$v = \sup A$. Show that A is then one of the intervals with endpoints u and v. This
result is often useful.

8. Show that if we work only with the rational numbers (for which intervals can be
defined by the same formulas as were used in the case of the real numbers) then
the conclusion of the previous exercise fails in general.
Note. The statement of the previous exercise has sometimes been proposed as a completeness
axiom, that is, as a replacement for axiom C1.

9. Extend the function [x] (as defined above for positive x) to all real arguments.
Show that one may unambiguously define [x] as the unique integer such that
$[x] \le x < [x] + 1$.

10. Let $a > 0$. Show that for every real x there exist a unique integer m and a unique
number r, such that $0 \le r < a$ and $x = ma + r$.

11. Let $A \subset \mathbb{Z}$. Show that if A is bounded above it has a highest element, and if A
is bounded below it has a lowest element.


2.5 Approximation by Rational Numbers 


We can interpret Proposition 2.4 as saying that there are no infinitely large real num- 
bers, that is, no numbers that exceed every natural number. The counterpart to this 
is that there are no infinitely small numbers either, that is, no real numbers that are 
at the same time positive and are smaller than 1/n for every natural number n. This
is essentially a denial that infinitesimals exist in the real realm. 


Proposition 2.5 For each real number $\varepsilon > 0$ there exists a natural number n such
that $0 < 1/n < \varepsilon$.


Proof By Proposition 2.4 there exists a natural number n such that $n > 1/\varepsilon$. But
then $0 < 1/n < \varepsilon$.


Proposition 2.6 Each real number x is the supremum of the set of rational numbers
that are lower than x.


Proof Let $A = \mathbb{Q} \cap {]-\infty, x[}$ (that is, A is the set of rational numbers lower than
x). It is clear that x is an upper bound of A and we want to show that it is sup A.
Suppose, on the contrary, that sup A, let us call it t, is actually lower than x, and let
us derive a contradiction. Then $x - t > 0$ and we can find a natural number n by
Proposition 2.5 such that $0 < 1/n < x - t$. Since $t = \sup A$, there exists a number
$r \in A$, such that $t - 1/n < r \le t$. But then $r + 1/n$ is rational, $r + 1/n > t$ and
$r + 1/n \le t + 1/n < x$. That is, $r + 1/n$ is an element of A that is strictly above t.
This contradicts the fact that t is an upper bound for A.


Proposition 2.7 Between any two distinct real numbers there exist infinitely many 
rational numbers. 


Proof Let x < y. We create an increasing sequence of rationals between x and y in 
order to build the required infinite set. Sequences will be studied thoroughly in the 
next chapter. 

We first show that there exists a rational number $r_1$ such that $x < r_1 < y$. Consider
the midpoint $z = \frac{1}{2}(x + y)$, which satisfies $x < z < y$. By Proposition 2.6, z
is the supremum of the set of all rational numbers less than z. Hence there exists a
rational number $r_1$ such that $x < r_1 < z$. We therefore have $x < r_1 < y$. Repeating
this argument using $r_1$ instead of x we find a rational $r_2$ such that $r_1 < r_2 < y$, then
a rational $r_3$ such that $r_2 < r_3 < y$, and so on. In this way we construct an increasing
sequence of rationals $r_n$, $n = 1, 2, 3, \ldots$, such that $x < r_n < y$. The elements of this
sequence form an infinite set.


These propositions assure us that we may always approximate a real number by 
rational numbers, and make the error of the approximation as small as we like. This 
leads into the notions of limit and limit point, taken up in the next chapter. They 
reinforce our impression that the real numbers are a good model for the coordinate 
line, where adding more points increases accuracy of measurement. Technically we 
say that the rational numbers are dense in the set of real numbers. 
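
Density can be made concrete with a short computation. The sketch below does not follow the proof of Proposition 2.7; instead it uses a more direct construction (choose a natural number n with 1/n < y − x, then the lowest integer m with nx < m), which is the one outlined in Exercise 1 of Sect. 2.5.1. The function name `rational_between` is hypothetical, and floating-point inputs are treated as the exact rational values they store.

```python
from fractions import Fraction
import math


def rational_between(x, y):
    """Return a rational m/n with x < m/n < y, assuming x < y."""
    x, y = Fraction(x), Fraction(y)   # floats are converted exactly
    n = math.floor(1 / (y - x)) + 1   # natural number with 1/n < y - x
    m = math.floor(n * x) + 1         # lowest integer with n*x < m
    return Fraction(m, n)


x, y = math.sqrt(2), math.pi / 2
q = rational_between(x, y)
print(q)
```

Since m − 1 ≤ nx < m, we get x < m/n ≤ x + 1/n < y, which is why the returned fraction lies strictly between the two inputs.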

All of this is important for practical calculations with real numbers. One way to 
approximate a real number is to use the decimal representation (or the representation 
in another number base). However, there are other ways that may be more precise. The 
subject of approximation by rational numbers is called diophantine approximation 
and is considered a branch of number theory. 


2.5.1 Exercises 


1. Let a and b be real numbers such that $a < b$. Show that a rational number between
a and b can be obtained by the following argument, which is perhaps more
intuitive than the proof of Proposition 2.7. Let n be a natural number, such that
$0 < 1/n < b - a$, and let m be the lowest integer (possibly negative) such that
$na < m$. Show that $a < m/n < b$.




2. Before the advent of calculus, arguments were used that involved neglecting
quantities small with respect to other quantities. So for example, if a quantity h
appeared in a formula, and h was small (on some scale) then $h^2$ might be neglected
as being super small. This led to useful methods of approximation.
As an example, if x is an approximation to $\sqrt{a}$, we can write $x + h$ for the actual
$\sqrt{a}$. Then $(x + h)^2 = a$ so that $x^2 + 2hx + h^2 = a$. If we suppose that the term
$h^2$ can be neglected we might conclude that h is approximately $(a - x^2)/2x$, or
rather suggestively

$$\frac{h}{x} \approx \frac{1}{2} \cdot \frac{a - x^2}{x^2}.$$

In words we have the well known rule that the proportional deficit of x is approximately
half the proportional deficit of $x^2$.


(a) Derive a similar rule for approximating a cube root: the proportional deficit 
of x is approximately one-third the proportional deficit of $x^3$.

(b) Approximate the cube root of 1729. With 12 as a first approximation use 
mental arithmetic to get a better approximation, which should prove to have 
three correct digits after the decimal point and the fourth very nearly correct. 
Note. This example is the basis of an amusing anecdote in “Lucky Numbers” in the book
“Surely You're Joking, Mr. Feynman!”.
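
The square-root rule described at the start of this exercise can be tried out numerically. The following is a minimal sketch; the function name `improve_sqrt` and the starting value 1.4 for $\sqrt{2}$ are choices made for the illustration. It applies one correction $h = (a - x^2)/2x$, that is, the step obtained by neglecting $h^2$.

```python
import math


def improve_sqrt(x, a):
    """One correction step: x + h with h = (a - x^2) / (2x), neglecting h^2."""
    h = (a - x * x) / (2 * x)
    return x + h


start = 1.4
better = improve_sqrt(start, 2.0)
print(better, math.sqrt(2.0))
```

The cube-root rule of part (a) leads to a correction of the same shape, which is left to the reader as the exercise intends.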


3. Let $k_1$ and $k_2$ be real numbers, neither of which is 0.

(a) Show that if the set of all numbers of the form $mk_1 + nk_2$, as m and n range
over the integers, contains a lowest positive number a, then $k_1$ and $k_2$ are
both integer multiples of a.

(b) Suppose that $k_1/k_2$ is irrational. Prove the following: for each $\varepsilon > 0$, there
exist integers m and n such that $0 < mk_1 + nk_2 < \varepsilon$.


4. There exist ordered fields (non-Archimedean fields) in which the natural numbers
are bounded above. Of course they do not satisfy axiom C1. To describe one we
will need a little algebra. The field we shall describe, let us denote it by F, consists
of all rational functions in one variable with real coefficients. These have the form
of fractions

$$\frac{p_m x^m + p_{m-1} x^{m-1} + \cdots + p_0}{q_n x^n + q_{n-1} x^{n-1} + \cdots + q_0}$$

where the coefficients $p_0, \ldots, p_m, q_0, \ldots, q_n$ are real. The algebraic operations are the
usual ones for fractions. The element x should not be thought of as a number; it is
an indeterminate, and outside the system of real numbers. To describe the ordering
we suppose that the fraction is written in lowest terms, that is, the numerator and
denominator have no common factors except scalars (polynomials of degree 0).
The fraction f/g is defined to be positive if the leading coefficients $p_m$ and $q_n$
have the same sign. Then $f < g$ is supposed to mean that $g - f$ is positive.

(a) Show that the relation $f < g$ is an ordering in the sense of axioms B.
Hint. It can help to use Sect. 2.2 Exercise 6.

(b) The natural number $\ell$ is identified with the zero-degree polynomial $\ell$. Show
that for all natural numbers $\ell$ we have $\ell < x$. Hence the element x is an
upper bound for the set N.

(c) Show that for all natural numbers $\ell$ we have $0 < 1/x < 1/\ell$. So we can
think of 1/x as an infinitesimal.

(d) Find a non-empty subset of the ordered field F that is bounded above but 
does not have a least upper bound. 


Chapter 3
Sequences and Series


To those who ask what the infinitely small quantity in 
mathematics is, we answer that it is actually zero. Hence there 
are not so many mysteries hidden in this concept as there are 
usually believed to be. 


L. Euler 


3.1 Sequences 


What are sequences? How do we talk about them? We have already seen examples 
of sequences, such as the Fibonacci numbers (studied in Sect. 2.2 Exercise 9), or the 
decimal digits of a real number. As always for the most important ideas, we have a 
choice of ways to express ourselves. 


3.1.1 The Notion of Sequence 


A scientist might make a table of measurements such as 


Place number n |   1   |   2   |  3  |   4   |   5   |   6   |   7
Value          | 0.707 | 0.577 | 0.5 | 0.447 | 0.408 | 0.378 | 0.354


We can easily read from this the first measurement or value, the second and so on. We 
can read the nth value for each natural number n from 1 to 7. These natural numbers
have the role of place numbers. Each measurement has a place number. 

We can also exhibit this example as a coordinate vector, without showing the place 
number, but allowing us to infer the place number by counting from left to right: 
(0.707, 0.577, 0.5, 0.447, 0.408, 0.378, 0.354).


For each natural number n from 1 to 7 we let $a_n$ denote the number with place
number n in the above table. We have here an example of a sequence with 7 terms.
We express this as follows: $(a_n)_{n=1}^{7}$ is a sequence for which $a_n$ is given by the value
with place number n in the table. The use of ordinary parentheses, with subscript
and superscript showing the range of the place numbers, is quite typical here. Curly 
brackets are sometimes used, but should be discouraged, as they usually specify sets, 
in which the order of elements is immaterial and repetitions irrelevant (recall the 
discussion in Sect. 2.1). 

The value with place number n is called the nth term of the sequence. In a finite
sequence there is a first term, a second term, a third term and so on up to some place
number N. Different terms may have the same value, though the place numbers are
different. 

We can also specify a sequence using a formula. For example, let $(a_n)_{n=1}^{7}$ be a
sequence for which $a_n = 1/\sqrt{n + 1}$ for each natural number n from 1 to 7. We have
a formula instead of a table. In this example there is no reason to stop at 7 terms. We
can continue indefinitely. We then have an infinite sequence $(a_n)_{n=1}^{\infty}$.
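
That the formula and the table describe the same seven terms is easy to check by machine. The sketch below is only an illustration: the name `a` for the function and the rounding to three decimal places (matching the precision of the tabulated measurements) are assumptions.

```python
import math


def a(n):
    """Term with place number n of the sequence a_n = 1 / sqrt(n + 1)."""
    return 1 / math.sqrt(n + 1)


table = [0.707, 0.577, 0.5, 0.447, 0.408, 0.378, 0.354]
first_seven = [round(a(n), 3) for n in range(1, 8)]
print(first_seven == table)
```

Nothing in the function restricts n to the range 1 to 7, which reflects the remark that the formula can be continued indefinitely.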

It bears repeating that a sequence should not be confused with a set. For example, 
the coordinates of a point in coordinate geometry form a sequence. The points (1, 2) 
and (2, 1) in a coordinate plane are distinct sequences each with two terms. Their 
values constitute the same set of numbers {1, 2}. Again, the sequence (1, 1) is not 
the same as the set {1, 1}; the latter is the same as {1} and has only one element. 
This demonstrates again the different uses of parentheses and curly brackets—for 
the former the order matters; for the latter neither order nor repetitions make any 
difference. 

We can give a reasonable working definition of sequence as follows. Here it clearly 
appears why the natural numbers must constitute a set. 


Definition A sequence is an assignment of an element to each natural number of a 
given range, called the index set. 


The index set is usually the set of all integers n such that $1 \le n \le N$, giving rise
to a finite sequence, or the set $\mathbb{N}_+$, giving rise to an infinite sequence. We shall often
speak of a sequence $(a_n)_{n=1}^{N}$ (in the case of a finite sequence), or a sequence $(a_n)_{n=1}^{\infty}$
(in the case of an infinite sequence), prior to explaining how the value $a_n$ is to be
assigned to the place number, or index, n.

We are being deliberately vague over the precise meaning of assignment. It could 
be a formula. It could be a table. However, we want to admit other cases when we 
do not have a practical method to calculate $a_n$ for a given n. Therefore we avoid
speaking of a “rule”, which has connotations of a formula, suggesting a practical 
procedure. 

We saw that a sequence is not the same as a set. Values appearing as terms in a 
sequence can be repeated. Sometimes we wish to consider the set of all values that 
appear as terms in a sequence $(a_n)_{n=1}^{N}$ or $(a_n)_{n=1}^{\infty}$. There is a notational trick. We
write the set as

$$\{a_n : 1 \le n \le N\} \quad \text{or} \quad \{a_n : 1 \le n < \infty\}$$


depending on the case. When considered as a set, the elements have forgotten their 
place numbers. 
Consider some examples of sequences, in order of decreasing practicality. The 
first is the sequence 
(1, 1.4, 1.41, 1.414, 1.4142, ...) 


of approximations to $\sqrt{2}$ obtained by truncating its decimal representation. It is not
easy to write down a formula for the nth term of this sequence, but the term can be
calculated by a fairly easy algorithm. 
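
One possible algorithm is sketched below; it is offered as an illustration and is not necessarily the one the text has in mind. It relies on the integer square root: the truncation of $\sqrt{2}$ to k decimal places is $m/10^k$, where m is the largest integer with $m^2 \le 2 \cdot 10^{2k}$. The function name `truncated_sqrt2` is hypothetical.

```python
from fractions import Fraction
from math import isqrt


def truncated_sqrt2(k):
    """Decimal expansion of sqrt(2) truncated to k digits after the point."""
    # isqrt(2 * 10**(2k)) is the largest integer m with m^2 <= 2 * 10**(2k),
    # that is, the largest m with m / 10**k <= sqrt(2).
    return Fraction(isqrt(2 * 10 ** (2 * k)), 10 ** k)


terms = [truncated_sqrt2(k) for k in range(5)]
print([float(t) for t in terms])
```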

The second is the sequence $(a_n)_{n=1}^{\infty}$ where $a_n = 1$ if n is prime and $a_n = 0$ if n is
composite. This is an infinite sequence because the index set is infinite. Even so each
term is either 0 or 1, so that the values form a finite set. There seems at present to
be no practical way to calculate $a_n$ since it asks us to determine whether or not n is
prime, a problem that for large n is still prohibitively costly to solve on a computer.

The third example is to let $a_n = 1$ if the equation $x^n + y^n = z^n$ has a solution for
which x, y, z are positive integers, and $a_n = 0$ if there is no such solution. Before
1995 it was known that $a_1 = 1$, $a_2 = 1$ and $a_n = 0$ for a known but finite number
of places n. It was suspected that $a_n = 0$ for all $n \ge 3$ but no practicable way was
known to calculate $a_n$ for a given n. In 1995 Andrew Wiles published the proof of
Fermat’s Last Theorem, as a result of which we now know that $a_n = 0$ for all $n \ge 3$.

A common variant of sequence, that we shall make much use of, is to allow the 
place numbers to start at n = 0, with $a_0$ instead of $a_1$. We then have a sequence
$(a_n)_{n=0}^{\infty}$ for example. This is useful when n denotes discrete time, beginning at the
instant n = 0.

A further variation is to admit negative place numbers. For example $(a_n)_{n=-10}^{10}$,
or a sequence infinite on both sides, $(a_n)_{n=-\infty}^{\infty}$.

In a later chapter we will define functions, like sequences, by using the undefined 
term “assignment”. It turns out that the notion of assignment can be defined by 
set theory, and then what assignments are admissible depends on what set building 
axioms we wish to allow, or whether we want to allow some really weird sets. It is 
therefore still possible for two people to disagree over what constitutes an admissible 
assignment. But it does mean that sequences and functions are both examples of the 
same thing. Our mental pictures may still be quite different. 


3.1.2 Defining a Sequence by Induction 


We can define $a_n$ in terms of $a_k$ for $k = 1, 2, \ldots, n - 1$. This could have more practical
value than a formula for $a_n$. We need to specify some initial values to get the process
started. 
Example The Fibonacci numbers constitute a sequence $(a_n)_{n=1}^{\infty}$ defined by letting
$a_1 = 1$, $a_2 = 1$ and

$$a_n = a_{n-1} + a_{n-2}, \quad (n = 3, 4, 5, \ldots).$$

Using this one can generate the Fibonacci numbers more simply than by using the
formula for the nth Fibonacci number obtained in Sect. 2.2 Exercise 9.


Example Let $a_1 = 1$ and

$$a_n = \sqrt{a_{n-1} + 1}, \quad (n \ge 2).$$


In these examples the sequence is said to be defined by induction, or by recursion. 
We have a rule for finding $a_n$ from the terms $a_k$ with place numbers $k < n$, and an
appropriate set of initial values to start the ball rolling. 
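
A definition by induction is, in effect, a loop. As a minimal sketch (the function name `fibonacci` and the decision to return a list are choices made for the illustration), the Fibonacci numbers can be generated from the two initial values and the rule that each later term is the sum of the two preceding ones:

```python
def fibonacci(count):
    """Return the first `count` Fibonacci numbers a_1, a_2, ..., a_count."""
    terms = []
    for n in range(1, count + 1):
        if n <= 2:
            terms.append(1)                      # initial values a_1 = a_2 = 1
        else:
            terms.append(terms[-1] + terms[-2])  # a_n = a_(n-1) + a_(n-2)
    return terms


print(fibonacci(10))
```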

Many important operations on a finite number of objects, that we often take for 
granted, are defined formally by induction. We can name the sum of n numbers, the 
product of n numbers, the union of n sets and the intersection of n sets.


3.1.3 Infinite Series 


Let $(a_n)_{n=1}^{\infty}$ be an infinite sequence of real numbers and define inductively a new
sequence $(s_n)_{n=1}^{\infty}$ by

$$s_1 = a_1 \quad \text{and} \quad s_n = s_{n-1} + a_n, \quad (n \ge 2).$$

Informally we sometimes write

$$s_n = a_1 + a_2 + \cdots + a_n$$

but the formal symbol (which we have already used several times and is doubtless
familiar to the reader) is

$$s_n = \sum_{k=1}^{n} a_k.$$

But what can

$$\sum_{k=1}^{\infty} a_k$$

mean? It seems to ask us to add infinitely many numbers and on the face of it means
nothing. Whether or not it means anything such an expression is called an infinite
series.
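
The inductive definition of the partial sums can be carried out mechanically for as many terms as we please, though never for all of them at once. The sketch below is an illustration only: the choice $a_k = 1/2^k$ and the function name `partial_sums` are assumptions. It computes the first ten partial sums exactly.

```python
from fractions import Fraction
from itertools import accumulate


def partial_sums(terms):
    """Return an iterator over s_1, s_2, ... with s_1 = a_1, s_n = s_(n-1) + a_n."""
    return accumulate(terms)


a = [Fraction(1, 2 ** k) for k in range(1, 11)]  # a_k = 1/2^k for k = 1, ..., 10
s = list(partial_sums(a))
print([str(x) for x in s])
```

For this particular choice each partial sum equals $1 - 2^{-n}$, which suggests what the infinite series ought to mean.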
3.2 Limits


One of the greatest achievements of analysis was to give the expression $\sum_{k=1}^{\infty} a_k$ a
precise meaning so that mathematicians could obtain valid and trustworthy conclu- 
sions. The key to this is the concept of limit. After the elucidation of the nature of 
the real numbers, the notion of limit is the second main foundation stone of analysis. 
So it is worth devoting some time and effort to understanding it. We first define the 
notion of limit of a sequence, and develop it at some length before applying it to 
infinite series. 
Let $(a_n)_{n=1}^{\infty}$ be an infinite sequence of real numbers.


Definition A number t is called the limit of the sequence $(a_n)_{n=1}^{\infty}$ if the following
condition is satisfied:


For every $\varepsilon > 0$, there exists a natural number N, such that $|a_n - t| < \varepsilon$ for
all $n \ge N$.


The inequality $|a_n - t| < \varepsilon$ appearing in this definition is equivalent to

$$t - \varepsilon < a_n < t + \varepsilon,$$

and for practical purposes (such as carrying out a proof) this may be by far the most
convenient form. It may also be written, using set theory, as

$$a_n \in {]t - \varepsilon, t + \varepsilon[}.$$

We write

$$\lim_{n \to \infty} a_n = t$$

to denote that t is the limit of the sequence $(a_n)_{n=1}^{\infty}$.

The limit t may, or may not, be equal to $a_n$ for some n. There may even be
infinitely many places n, such that $a_n = t$; or there may be none at all. This may seem
an obvious point, but early thinking about limits often assumed that a limit might 
not be a value of the sequence of which it is a limit. The sequence was supposed to 
approach arbitrarily near to its limit without ever reaching it. This thinking was an 
early obstacle to finding the correct definition of limit. 

Not all sequences have limits. A sequence that has a limit is said to be convergent. 
We also say that $a_n$ converges to t, or that $a_n$ tends to t, and sometimes write $a_n \to t$.
Thus when we say that a sequence $(a_n)_{n=1}^{\infty}$ is convergent, we mean: there exists a
real number t, such that $\lim_{n \to \infty} a_n = t$.

Often we are dealing with particular sequences where we have a formula. Examples
are

$$a_n = \frac{1}{n}, \quad a_n = 2^{-n}, \quad \text{or} \quad a_n = \frac{n+1}{n}.$$
In such cases we do not write “$\lim_{n \to \infty} a_n$ where $a_n = 1/n$”, and so on (though we
might), but instead (for these examples)

$$\lim_{n \to \infty} \frac{1}{n}, \quad \lim_{n \to \infty} 2^{-n}, \quad \lim_{n \to \infty} \frac{n+1}{n}.$$
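
For a sequence given by a formula, the N demanded by the definition of limit can often be written down explicitly. The sketch below takes $a_n = (n+1)/n$ with the claimed limit t = 1, so that $|a_n - 1| = 1/n$, and chooses N as the integer part of $1/\varepsilon$ plus one. The function names `a` and `witness` are hypothetical. Checking a finite range of n illustrates the definition but proves nothing; the proof is the inequality $1/n \le 1/N < \varepsilon$ for $n \ge N$.

```python
from fractions import Fraction
import math


def a(n):
    """The example sequence a_n = (n + 1) / n."""
    return Fraction(n + 1, n)


def witness(eps):
    """Return an N with |a_n - 1| < eps for all n >= N, using |a_n - 1| = 1/n."""
    return math.floor(1 / eps) + 1


eps = Fraction(1, 1000)
N = witness(eps)
checked = all(abs(a(n) - 1) < eps for n in range(N, N + 5000))
print(N, checked)
```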


A sequence that does not have a limit, that is, a sequence that is not convergent, is 
often said to be divergent. Strictly speaking, when we are presented with a sequence 
$(a_n)_{n=1}^{\infty}$, it is not good mathematical grammar to write the formula “$\lim_{n \to \infty} a_n$” until
we have ascertained that the limit exists. The expression “$\lim_{n \to \infty}$” is not a function
symbol that can be stuck in front of an arbitrary sequence. This is perhaps a little
awkward, but there are ways to alleviate this (see the nugget “Limit inferior and limit
superior”).


3.2.1 Writing the Definition of Limit in English, and in Logic 


The property of a sequence embodied in the definition of limit involves quantifiers, 
that is, the expressions “there exists” and “for all”, in fact three of them. It seems 
that there is no escaping this, and it is perhaps the reason why the correct definition 
of limit was so long emerging. It is also a significant challenge to define the property 
in ordinary language. 

The definition of $\lim_{n \to \infty} a_n = t$ is written in natural English that contains some
in-line mathematical formulas. It is not hard to produce a non-mathematical sentence, 
again in natural English, that has the same syntactical structure and the same logical 
structure: 


In every class was a pupil, who obtained full marks in all their examinations. 


The definition of limit can be expressed entirely in a mathematical language called 
first-order logic. It has the form (slightly simplified for greater ease of reading): 


$$(\forall \varepsilon > 0)(\exists N \in \mathbb{N})(\forall n \in \mathbb{N})(n \ge N \Rightarrow |a_n - t| < \varepsilon).$$


A literal translation of this into rather unnatural English, keeping to the same phrase 
order, might be 


For all $\varepsilon > 0$ there exists a natural number N, such that, for all natural numbers
n, $n \ge N$ implies $|a_n - t| < \varepsilon$.


It will be noticed that in the logical sentence all quantifiers (“there exists”, “for all”,
symbolised by “$\exists$”, “$\forall$”) are placed at the front, whereas in idiomatic English one of
them is placed at the end. The order of the quantifiers is most important. 

Let us examine the example of the pupils who obtained full marks. This may be 
expressed using formal quantifiers by 
$$(\forall a)(\exists b)(\forall c)(F(a, b, c)). \qquad (*)$$

Here we are using “a” for a variable that ranges over all classes, “b” for a variable 
that ranges over all pupils and “c” for a variable that ranges over all examinations. 
The expression “F(a, b, c)” stands for 


b was in class a, and if b took the examination c they achieved full marks in it. 


The conditional clause (introduced by “if”) is needed because it is not implied that
all the pupils took the same examinations. But it has a slightly odd implication. In 
logic the claim: 


if b took the examination c they achieved full marks in it 


which is a part of F(a, b,c) is true if b did not in fact take the examination c. 
So the sentence (*) is true in the (perhaps exceptional and possibly unintended) 
circumstances that in every class was a pupil who took no examinations at all, because 
it is true that such a pupil got full marks in all that they took. In mathematics one 
has to be on the lookout for such circumstances, where a statement can be true by 
default because it directs one to test something of which there are no instances, like 
the examinations taken by that absentee pupil. 
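
Truth by default shows up in programming as well. In the sketch below the data are entirely hypothetical: each class is a dictionary from pupils to the list of marks for the examinations they actually took, and full marks are taken to be 100. Python's built-in `all` returns `True` on an empty collection, which is exactly the logical convention just described.

```python
# Hypothetical data: each class maps pupils to the marks of examinations taken.
FULL = 100
classes = {
    "class 1": {"Ann": [100, 100], "Bob": [80, 100]},
    "class 2": {"Cat": [90], "Dan": []},  # Dan took no examinations
}


def sentence_star(classes):
    """Evaluate (for all a)(there exists b)(for all c) F(a, b, c) on the data."""
    return all(
        any(all(mark == FULL for mark in marks) for marks in pupils.values())
        for pupils in classes.values()
    )


print(sentence_star(classes))
```

In class 2 the only pupil who makes the sentence true is the one who took no examinations.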

Whilst the phrasing of the definition of limit given in the last section was common 
in the early twentieth century (Hardy, Whittaker and Watson, Burkill), in the latter 
half of the century there began a tendency to write the quantifiers in the English 
sentence in the position in which they occur in the formal logical definition. It even 
seems that the older phrasing of the definition is disparaged. It may be thought that 
placing a quantifier at the end of the sentence, as is common in idiomatic English, is 
ambiguous, as it is arguably not clear where it should be placed in the precise first- 
order logical sentence that is supposed to express the same idea. A misunderstanding 
could arise that the intended logical sentence is 


$$(\forall n \in \mathbb{N})(\forall \varepsilon > 0)(\exists N \in \mathbb{N})(n \ge N \Rightarrow |a_n - t| < \varepsilon).$$


This proposition is completely different from the condition for a limit. It is trivially 
true, since given $n$ and $\varepsilon$ we can choose $N$ so that $n < N$, and then the proposition 
“$n \ge N \Rightarrow |a_n - t| < \varepsilon$” is true, because in propositional logic an assertion of the 
form “$p \Rightarrow q$” is true if $p$ is false, as we saw in the case of the pupil who took no 
examinations, but got full marks in all that they took. It therefore says nothing about 
the sequence $(a_n)_{n=1}^{\infty}$. 

In fact the risk of this misunderstanding of the definition is negligible, and is 
rendered even more so by placing a comma after “$N$”, which ties the quantifier “for 
all $n \ge N$” firmly to the formula “$|a_n - t| < \varepsilon$”. 

Nevertheless let us look at ways that have been used to make the literal translation 
of the logical formula more natural whilst keeping the order of its mathematical 
elements. 

The problem lies in the awkward juxtaposition of two formulas “$n$” and “$n \ge N$”. 
This must be avoided. Here is one way: 


For all $\varepsilon > 0$ there exists a natural number $N$, such that, for all natural numbers 
$n \ge N$ we have $|a_n - t| < \varepsilon$. 


The phrase “we have” is perhaps acceptable but is rather artificial. Even so we shall 
sometimes use this type of formulation for the sake of variation. Here is another way: 


For all $\varepsilon > 0$ there exists a natural number $N$ such that, for all natural numbers 
$n$, if $n \ge N$ then $|a_n - t| < \varepsilon$. 


The juxtaposition “$n$, if $n$” jars but is perhaps acceptable. Some writers adhere to this 
form but reduce the jarring effect by using a centred display, even going so far as to 
enclose it in a box (though not shown here): 


For all $\varepsilon > 0$ there exists a natural number $N$, such that, for all natural 
numbers $n$, 
        if $n \ge N$ then $|a_n - t| < \varepsilon$. 


This is again rather artificial. Yet another way is to drop the third quantifier entirely, 
whilst implying its presence: 


For all $\varepsilon > 0$ there exists a natural number $N$, such that if $n \ge N$ then 
$|a_n - t| < \varepsilon$. 


This is acceptable if we make the quite reasonable assumption that the reader 
understands from the context that “if $n \ge N$ then $|a_n - t| < \varepsilon$” means “for all $n$, if $n \ge N$ 
then $|a_n - t| < \varepsilon$”. However failure to understand it in this way is hardly less 
plausible than a misunderstanding of the older phrasing of the definition. 

In fact nobody would dream of saying: 


In every class was a pupil, such that in all their examinations they got full 
marks. 


3.2.2 Limits are Unique 


If a sequence has a limit then it has only one. This was implicit in the way we talked 
about limits in the last section. We must show that two distinct numbers $t$ and $s$ 
cannot both be limits of the same sequence $(a_n)_{n=1}^{\infty}$. In fact if $s \ne t$ we can choose 
$\varepsilon > 0$ smaller than $|s - t|/2$. Then there is no number common to the intervals 
$]t - \varepsilon, t + \varepsilon[$ and $]s - \varepsilon, s + \varepsilon[$. So it is impossible that there could exist $N_1$ and $N_2$ 
such that $a_n \in\, ]t - \varepsilon, t + \varepsilon[$ for all $n \ge N_1$ and $a_n \in\, ]s - \varepsilon, s + \varepsilon[$ for all $n \ge N_2$. 


3.2.3 Exercises 


1. Write the argument of the last paragraph using inequalities, that is, show that 
$|a_n - t| < \varepsilon$ and $|a_n - s| < \varepsilon$ cannot be simultaneously true if $\varepsilon$ is chosen as we 
described. 

2. Check the assertion that the inequality $|x - a| < \varepsilon$ is equivalent to the two 
inequalities $a - \varepsilon < x < a + \varepsilon$. This reformulation is often helpful. 

3. Find some other examples of non-mathematical assertions that require three quan- 
tifiers. 


3.2.4 Free Variables and Bound Variables 


In the formula 
$$\lim_{n\to\infty} \frac{n+1}{n}$$


the variable n can be replaced by any other letter without changing the sense or value. 


Thus 
$$\lim_{n\to\infty} \frac{n+1}{n} = \lim_{\beta\to\infty} \frac{\beta+1}{\beta},$$


provided it is understood that the variables $n$ and $\beta$ range over the natural numbers. 
It is not just that the quantities are equal, the expressions have the same meaning. 

In the formula $1/n$ we can put in values for $n$ (from the realm of natural numbers, 
for example) and calculate in this way instances of the formula. The variable $n$ is 
free. But putting “$\lim_{n\to\infty}$” before “$1/n$” ties $n$ down. It makes no sense to ask: 
“What is the value of $\lim_{n\to\infty} 1/n$ for $n = 5$?” The variable $n$ has become bound 
by the prefix “$\lim_{n\to\infty}$”. It is for this reason that any other letter may replace “$n$” 
without changing the meaning. 


Another example of bound and free variables is the expression 


$$\sum_{k=1}^{n} k^p.$$


Here $k$ is bound but $n$ and $p$ are free. We may replace $k$ by any other letter, except 
for $n$ and $p$ as they are in use, and the meaning remains unchanged. But although we 
may ask about the value of this expression for $p = 2$ and $n = 10$, there is no sense 
in asking about its value for $p = 2$, $n = 10$ and $k = 5$. 
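
Programming languages draw the same distinction, and a short Python sketch may make it concrete. The function names `power_sum` and `power_sum_renamed` are invented for this illustration. The free variables become the parameters of the function, while the bound variable lives only inside the summation.

```python
def power_sum(n, p):
    """The sum of k**p for k = 1, ..., n.

    n and p are free: a caller must supply values for them.
    k is bound: it exists only inside the generator expression.
    """
    return sum(k**p for k in range(1, n + 1))


def power_sum_renamed(n, p):
    """The same expression with the bound variable renamed from k to j."""
    return sum(j**p for j in range(1, n + 1))


print(power_sum(10, 2))           # the value for p = 2 and n = 10
print(power_sum_renamed(10, 2))   # renaming the bound variable changes nothing
```

There is no way to call `power_sum` "with $k = 5$": the function has no parameter of that name, which mirrors the fact that the question makes no sense for the formula.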

The test for a free variable is to ask yourself whether the following question makes 
sense: 


What is the value of the expression if I substitute 1 (or some other value within 
the allowed range) for the variable? 


Here are some other commonly occurring expressions involving bound variables: 


(a) $\max_{1 \le k \le n} a_k$   The maximum of the numbers $a_1, a_2, \dots, a_n$; the variable $k$ is 
bound, but $n$ is free. 
(b) $\min_{1 \le k \le n} a_k$   The minimum of the numbers $a_1, a_2, \dots, a_n$; the variable $k$ is 
bound, but $n$ is free. 
(c) $\sup_{1 \le n < \infty} a_n$   The supremum of the set of terms of the infinite sequence 
$(a_n)_{n=1}^{\infty}$. 
(d) $\inf_{1 \le n < \infty} a_n$   The infimum of the set of terms of the infinite sequence 
$(a_n)_{n=1}^{\infty}$. 


A slightly different case is that of logical statements involving quantifiers. An 
example is 
$$(\forall x)(x < y \Rightarrow x < 1).$$


This says, in words, that for all $x$, if $x$ is a real number less than $y$, then $x$ is less 
than 1. Here $x$ is a bound variable; any other letter may replace it. But $y$ is a free 
variable, for it makes sense to ask whether the statement is true or false for given 
values of y. For example, is it true or false for y = 0, or 1, or 2? 


3.2.5 Proving Things Using the Definition of Limit 


In order to prove that $\lim_{n\to\infty} a_n = t$ directly from the definition of limit, we have 
to produce, or show that it is possible to produce, for each $\varepsilon > 0$, an integer $N$, 
such that for all $n \ge N$ we can show that $|a_n - t| < \varepsilon$ (or something equivalent to 
this inequality). What does this mean in practice? What is required in an acceptable 
argument that is supposed to justify that a sequence has the limit $t$? 

It is extremely important to write out the argument so that the reader (equipped 
with enough mathematical knowledge) can convince themselves that it is correct. It 
is often not enough for the writer to be convinced. This applies to all mathematical 
writing of course. 

The most important thing is to demonstrate that something is possible for every 
positive real number $\varepsilon$. To accomplish this it is best to write clearly at the beginning 
of the argument: 

“Let $\varepsilon > 0$.” 


(including the full stop), and if not at the beginning (see below for a discussion of 
this), at least starting a new paragraph or in some way giving it prominence. 

The next thing is to produce an $N$. Some ingenuity may be needed, or some 
informed guessing. In any case one must verify that just in virtue of $n \ge N$ it will 
follow that $|a_n - t| < \varepsilon$. 

Finally it must be made apparent that you can produce such an $N$ for any given 
positive $\varepsilon$. Here it is generally enough to make it clear that no special assumptions 
about $\varepsilon$ are needed (apart from its being positive). 

Actually that last remark can be modified in the light of good sense. For example 
it might be all right to assume (if it helps; and it often does) that $\varepsilon < 1$. The reason 
is that, if an $N$ works for one $\varepsilon$, it will work for any bigger $\varepsilon$. In the same way we 
might want to reduce $\varepsilon$ to below some value (if it is not already below it) if it helps 
to produce $N$. 

An observation similar to that of the last paragraph: if $N$ works for a given $\varepsilon$, then 
any number bigger than $N$ would work as well (as an alternative $N$ so to speak). This 
raises the question of whether we want to regard $N$ as a function of $\varepsilon$; that is, should 
$N$ be seen to be determined by $\varepsilon$? The answer is, that as far as the definition of limit 
is concerned, we do not have to display $N$ as explicitly determined by $\varepsilon$. It is true 
that sometimes we may want this, for example if we wish to obtain explicit estimates 
of errors; but that is another issue, beyond that of proving that a sequence has the 
limit $t$. To produce $N$ we may make some arbitrary choices along the way, without 
defining them explicitly. For example if we know that a certain set of numbers is not 
empty we can choose an element of it, without further explanation. And of course, 
the lowest $N$ that works is uniquely determined by $\varepsilon$, though it may be impractical 
to give a formula for it, and, as we have seen, it is not needed to prove that a limit 
exists. 

It is often a good idea to place some preliminary observations before introducing 
$\varepsilon$, but they should not refer to $\varepsilon$ of course. For example one might want to construct 
a sequence $b_n$, such that $|a_n - t| \le b_n$. Then in the body of the argument we might 
produce $N$, such that $b_n < \varepsilon$ for all $n \ge N$, if that should be simpler than dealing 
directly with $|a_n - t|$. 


3.2.6 Denying That $\lim_{n\to\infty} a_n = t$ 


Proof by contradiction is an important mathematical tool, in fact one can nearly 
define mathematics as the domain of human discourse where proof by contradiction 
is completely accepted. One assumes that the conclusion of a proposition is false and 
deduces from this assumption a false statement, or a statement inconsistent with the 
proposition’s premises. 

Imagine that the conclusion is the statement $\lim_{n\to\infty} a_n = t$. To prove this by 
contradiction we have to begin by negating the statement $\lim_{n\to\infty} a_n = t$. What does 
this entail? 

It is not simply equivalent to asserting that the limit of $a_n$ is not $t$, for, literally, 
this asserts that $a_n$ has a limit but the limit is not $t$. Rather we wish to assert the 
following: either the limit does not exist, or it exists and is not $t$. 

To say that the limit exists and equals $t$ means that for every $\varepsilon$ that is positive 
we can accomplish a certain task. So to deny this means that there exists $\varepsilon$ that is 
positive, and for which the task cannot be accomplished. This could be established by 
exhibiting just one value $\varepsilon$ for which the task is impossible. But what is the task? It is 
to produce $N$ that has a certain property. Therefore to say that the task is impossible 
means that for every $N$ the property in question is not available. What is the property 
that $N$ may or may not possess? It is that for all $n \ge N$ we have $|a_n - t| < \varepsilon$. So for 
$N$ not to possess this property entails that there exists at least one $n$, such that $n \ge N$ 
and $|a_n - t| \ge \varepsilon$. 

Finally we can set out what it means to negate the sentence $\lim_{n\to\infty} a_n = t$. It 
means the following: there exists $\varepsilon > 0$, such that for all natural numbers $N$ there 
exists $n \ge N$, such that $|a_n - t| \ge \varepsilon$. 

This can be written as a sentence of first-order logic (somewhat simplified to make 
it readable): 

$$(\exists \varepsilon > 0)(\forall N \in \mathbb{N})(\exists n)(n \ge N \wedge |a_n - t| \ge \varepsilon).$$
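
To see the negated sentence at work, here is a small Python sketch for the particular sequence $a_n = (-1)^n$ with the choice $\varepsilon = 1$. The names `a`, `EPS` and `witness` are invented for this illustration. For any candidate $t$ and any $N$ the function returns an index $n \ge N$ with $|a_n - t| \ge 1$. A finite run of checks is of course not a proof, but it shows what has to be exhibited.

```python
def a(n):
    """The n-th term of the sequence a_n = (-1)**n."""
    return (-1) ** n


EPS = 1  # the single epsilon exhibited in the negation


def witness(N, t):
    """Return some n >= N with |a_n - t| >= EPS.

    N and N + 1 have opposite parity, so a(N) and a(N + 1) are 1 and -1
    in some order; since |1 - t| + |-1 - t| >= 2, one of them is far from t.
    """
    for n in (N, N + 1):
        if abs(a(n) - t) >= EPS:
            return n
    raise AssertionError("unreachable for real t")


for t in (1, -1, 0, 0.3):
    n = witness(1000, t)
    print(t, n, abs(a(n) - t))
```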


3.2.7. Two Fundamental Limits 


We now calculate two very important limits. The proofs are our first real use of the 
definition of limit. 


Proposition 3.1 
(1) $\lim_{n\to\infty} \dfrac{1}{n} = 0$. 
(2) If $0 < x < 1$ then $\lim_{n\to\infty} x^n = 0$. 


Proof of the First Limit Let $\varepsilon > 0$. By Proposition 2.4 there exists a natural number 
$N$ such that $N > 1/\varepsilon$. If $n \ge N$ we have $1/n \le 1/N < \varepsilon$ so that $|1/n| < \varepsilon$. 


We do not have to produce the lowest $N$ that works, although here that would be 
easy. Just take as $N$ the lowest natural number greater than $1/\varepsilon$. 
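
As a check on this remark, the lowest $N$ can be computed exactly. The following Python sketch is an illustration only; the function name `lowest_N` is invented, and $\varepsilon$ is assumed to be given as a rational number (a `Fraction`) so that no rounding occurs.

```python
import math
from fractions import Fraction


def lowest_N(eps):
    """Lowest natural number N with N > 1/eps, for a positive rational eps."""
    return math.floor(1 / eps) + 1


eps = Fraction(1, 10)
N = lowest_N(eps)
print(N)

# Every n >= N gives 1/n < eps (checked here on a finite range only) ...
print(all(Fraction(1, n) < eps for n in range(N, N + 200)))
# ... whereas N - 1 does not work, so N is indeed the lowest.
print(Fraction(1, N - 1) < eps)
```

Note that when $1/\varepsilon$ happens to be an integer, as here, that integer itself does not work, because the inequality $1/n < \varepsilon$ is strict.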


Proof of the Second Limit First some preparation. Let $0 < x < 1$. Then $1/x > 1$ 
and we let $1/x = 1 + h$ where $h > 0$. By the binomial rule (Sect. 2.2 Exercise 11) 

$$\frac{1}{x^n} = (1 + h)^n \ge 1 + nh$$

(we drop all terms except the first and second). Therefore 

$$0 < x^n \le \frac{1}{1 + nh}.$$

Now we tackle the limit. Let $\varepsilon > 0$. To guarantee that $0 < x^n < \varepsilon$ it is enough to 
have $1/(1 + nh) < \varepsilon$, and that will be the case if $n \ge N$, where $N$ is the least natural 
number above $((1/\varepsilon) - 1)/h$. 
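
It is instructive to compare the $N$ produced by this proof with the smallest $N$ that actually works. The Python sketch below does this for one choice of $x$ and $\varepsilon$. The function names `proof_N` and `smallest_N` are invented for the illustration, and exact rational arithmetic is assumed so that the comparison is not blurred by rounding.

```python
import math
from fractions import Fraction


def proof_N(x, eps):
    """The N of the proof: least natural number above ((1/eps) - 1)/h, with 1/x = 1 + h."""
    h = 1 / x - 1
    return max(1, math.floor((1 / eps - 1) / h) + 1)


def smallest_N(x, eps):
    """Smallest N with x**n < eps for all n >= N (x**n is decreasing for 0 < x < 1)."""
    n = 1
    while x**n >= eps:
        n += 1
    return n


x, eps = Fraction(1, 2), Fraction(1, 1000)
print(proof_N(x, eps), smallest_N(x, eps))
```

For $x = 1/2$ and $\varepsilon = 1/1000$ the proof supplies $N = 1000$ whereas $N = 10$ already suffices. Both are perfectly acceptable for the purpose of proving the limit.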


The $N$ we found for the given $\varepsilon$ was much bigger than was necessary since 
$(1 + h)^n$ is far above $1 + nh$. But we did not need to find the smallest $N$ that works. 


3.2.8 Bounded Sequences 


Various notions of boundedness for sequences parallel the corresponding notions for 
sets of numbers, as defined in Sect. 2.5: 


(a) A sequence $(a_n)_{n=1}^{\infty}$ is said to be bounded above if there exists $K$, such that 
$a_n \le K$ for all $n$. 

(b) A sequence $(a_n)_{n=1}^{\infty}$ is said to be bounded below if there exists $K$, such that 
$a_n \ge K$ for all $n$. 

(c) A sequence $(a_n)_{n=1}^{\infty}$ is said to be bounded if there exists $K$ such that $|a_n| \le K$ 
for all $n$. 


Obviously a sequence is bounded if and only if it is both bounded above and bounded 
below. As usual it is sometimes convenient to write the inequality as $-K \le a_n \le K$, 
or using set theory as $a_n \in [-K, K]$. 


Proposition 3.2 A convergent sequence is bounded. 


Proof Assume that $\lim_{n\to\infty} a_n = t$. Apply the definition of limit with $\varepsilon = 1$. There 
exists $N$, such that $|a_n - t| < 1$ for all $n \ge N$. But then $|a_n| < |t| + 1$ for all $n \ge N$. 
Choose $k = \max_{1 \le n \le N-1} |a_n|$ and we then have for all $n$ that $|a_n| \le K$ where $K = 
\max(k, |t| + 1)$. 
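
The construction of $K$ in this proof is easily carried out on a concrete sequence. In the Python sketch below the function name `bound_from_proof` and the sample sequence $a_n = 10/n$ are chosen only for illustration. The caller is assumed to supply an $N$ that works for $\varepsilon = 1$; the code does not verify this.

```python
from fractions import Fraction


def bound_from_proof(a, t, N):
    """K = max(k, |t| + 1), where k is the largest |a_n| for 1 <= n <= N - 1.

    Assumes (without checking) that |a(n) - t| < 1 for all n >= N.
    If N = 1 there are no earlier terms, and k is taken to be 0.
    """
    k = max((abs(a(n)) for n in range(1, N)), default=0)
    return max(k, abs(t) + 1)


def a(n):
    """Sample sequence a_n = 10/n, which has limit 0."""
    return Fraction(10, n)


# |a_n - 0| < 1 exactly when n > 10, so N = 11 works for epsilon = 1.
K = bound_from_proof(a, 0, 11)
print(K)
print(all(abs(a(n)) <= K for n in range(1, 500)))
```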


3.2.9 The Limits $\infty$ and $-\infty$ 


Definition The sequence $(a_n)_{n=1}^{\infty}$ tends to infinity (or has the limit $\infty$), and we write 
$\lim_{n\to\infty} a_n = \infty$, if the following condition is satisfied: 


For each real number $K$ there exists a natural number $N$, such that $a_n > K$ 
for all $n \ge N$. 


Definition The sequence $(a_n)_{n=1}^{\infty}$ tends to minus infinity (or has the limit $-\infty$), and 
we write $\lim_{n\to\infty} a_n = -\infty$, if the following condition is satisfied: 


For each real number $K$ there exists a natural number $N$, such that $a_n < K$ 
for all $n \ge N$. 


A sequence with limit $\infty$ or $-\infty$ is not considered to be convergent. We sometimes 
say that it is divergent to $\infty$, or $-\infty$. The elements $\infty$ and $-\infty$ are not numbers, but 
they may be limits. We still say of such a sequence, though not convergent, that the 
limit exists. 

Often, instead of saying “The sequence $(a_n)_{n=1}^{\infty}$ is convergent” we say “The limit 
$\lim_{n\to\infty} a_n$ exists and is a finite number”. Here “finite number” means the same as 
“real number”, or, “neither $\infty$ nor $-\infty$”. 

In general a sequence is said to be divergent when it has no finite limit. This 
includes oscillating sequences such as $a_n = (-1)^n$, as well as a vast assortment 
of wild behaviour, but also sequences that tend to $\infty$ or tend to $-\infty$. In this way 
“divergent” is genuinely the negation of “convergent”. 


3.2.10 Exercises (cont’d) 


4. For each of the following limits find a suitable $N$ for each positive $\varepsilon$: 


1 
- te 4 
noo n— 1 
— neti 
ee 
2 
—n+1 
(dy. fan ng 


(e) $\lim_{n\to\infty} \left(\sqrt{n+1} - \sqrt{n}\right) = 0$. 


Hint. There is a trick for doing this using the mathematics teacher’s favourite 
identity: $(a + b)(a - b) = a^2 - b^2$. It can often be used in connection with 
square roots to avoid the use of continuity arguments. Write 

$$\sqrt{n+1} - \sqrt{n} = \frac{1}{\sqrt{n+1} + \sqrt{n}}.$$

(f) $\lim_{n\to\infty} \sqrt{1 + \dfrac{1}{n}} = 1$. 
(g) $\lim_{n\to\infty} \sqrt{h_n} = \sqrt{a}$ where $\lim_{n\to\infty} h_n = a$ and $a > 0$. 


5. Give a precise proof that the sequence $a_n = (-1)^n$ is divergent and has neither 
the limit $\infty$ nor $-\infty$. In brief, it has no limit. 

6. Show that if $a > 1$ then $\lim_{n\to\infty} a^n = \infty$. 

7. Show that if $a > 0$ then $\lim_{n\to\infty} a^{1/n} = 1$. 

Hint. If $a > 1$ write $a^{1/n} = 1 + b_n$ and estimate $b_n$. 

8. Show that $\lim_{n\to\infty} n^{1/n} = 1$. 

9. Let $A$ be a closed interval (recall that there are five types of closed intervals; 
see Sect. 2.4). Let $(a_n)_{n=1}^{\infty}$ be a sequence in $A$ that is convergent and let $t$ be its 
limit. Show that $t$ is in $A$. Prove also the converse of this: if $A$ is an interval that 
is not closed, then there exists in $A$ a convergent sequence whose limit is outside $A$. 


Note. This probably explains the appellation “closed”. You cannot exit a closed interval by 
going to the limit of a convergent sequence lying in the interval. 


3.3 Monotonic Sequences 


Although a convergent sequence is always bounded, it is far from the case that a 
bounded sequence is always convergent. There is though one important case when a 
bounded sequence is convergent. It gives us a tool that is used over and over again; 
a simple instance of the structure of a sequence ensuring its convergence. Its proof 
is perhaps the first really important application of supremum and infimum. 


Definition A sequence $(a_n)_{n=1}^{\infty}$ is said to be increasing if $a_n \le a_{n+1}$ for all $n$. It is 
said to be decreasing if $a_n \ge a_{n+1}$ for all $n$. A sequence that is either increasing or 
decreasing is said to be monotonic (or monotone). 


We shall sometimes refer to a sequence as being strictly increasing. This will 
mean rather obviously that $a_n < a_{n+1}$ for all $n$, similarly for strictly decreasing. 
A sequence can be both increasing and decreasing, if it is constant; a fact that we 
just have to live with. The terminology varies somewhat, but we want the simpler 
terminology for the more frequently encountered cases; and they are that $a_n \le a_{n+1}$ 
for each $n$, or $a_n \ge a_{n+1}$ for each $n$. 


Proposition 3.3 Let $(a_n)_{n=1}^{\infty}$ be a monotonic sequence. Then exactly one of the 
following is the case: 


(1) It is convergent. 
(2) It tends to $+\infty$. 
(3) It tends to $-\infty$. 


These three conclusions correspond respectively to the cases: 


(1) It is bounded. 
(2) It is unbounded and increasing. 
(3) It is unbounded and decreasing. 


Proof Consider the case where $(a_n)_{n=1}^{\infty}$ is bounded and increasing. The set 
$$A = \{a_n : n \in \mathbb{N}\}$$


is then bounded. Note carefully that $A$ is the set of values that occur in the sequence. 
This is shown by the use of curly brackets. Let $t = \sup A$. We shall show that 
$\lim_{n\to\infty} a_n = t$. 

Let $\varepsilon > 0$. Since $t$ is the supremum there exists $N$, such that $t - \varepsilon < a_N \le t$. As 
the sequence is increasing and $t$ is an upper bound, we have $t - \varepsilon < a_n \le t$ for all 
$n \ge N$. We conclude that $\lim_{n\to\infty} a_n = t$. 

Consider next the case when $(a_n)_{n=1}^{\infty}$ is unbounded but increasing. We shall show 
that $\lim_{n\to\infty} a_n = \infty$. 

Let $K$ be a real number. As the sequence is unbounded and increasing it is not 
bounded above. So there exists $N$, such that $a_N > K$. But then $a_n > K$ for all $n \ge N$. 
We conclude that $\lim_{n\to\infty} a_n = \infty$. 

The case when the sequence is decreasing is similar and is left to the reader. 


Example We shall define a sequence by a process called iteration, which is extremely 
important and has many practical applications. Iteration is a simple case of inductive 
definition where there is a fixed function $f(x)$ and $a_{n+1} = f(a_n)$ for each $n$. The 
example is a very typical application of Proposition 3.3. We use the fact that the 
function $\sqrt{x}$ is strictly increasing, meaning that if $x < y$ then $\sqrt{x} < \sqrt{y}$. 


Using induction we define the sequence $(a_n)_{n=0}^{\infty}$ by 

$$a_0 = 1, \qquad a_n = \sqrt{a_{n-1} + 2} \quad \text{for } n \ge 1.$$


We shall show that $(a_n)_{n=0}^{\infty}$ is increasing and convergent. 

We use induction to show that it is increasing, in fact strictly increasing. First 
of all we have $a_1 = \sqrt{3}$ so that $a_0 < a_1$. Assume that $a_{n-1} < a_n$ for a given $n \ge 1$. 
Since the function $\sqrt{x}$ is strictly increasing we have 

$$a_n = \sqrt{a_{n-1} + 2} < \sqrt{a_n + 2} = a_{n+1}.$$

We deduce that $(a_n)_{n=0}^{\infty}$ is an increasing sequence. 
We use induction also to show that the sequence is bounded, that in fact $a_n < 2$ 
for all $n$. In the first place $a_0 = 1$. Assume that $a_n < 2$ for a given $n \ge 0$. Then we 
find 
$$a_{n+1} = \sqrt{a_n + 2} < \sqrt{2 + 2} = 2,$$

and conclude that $a_n < 2$ for all $n$. 

By Proposition 3.3 the sequence is therefore convergent, and what is more 
$\lim_{n\to\infty} a_n = t = \sup_{n \ge 0} a_n$. Since $a_n < 2$ for all $n$ we have $t \le 2$. We cannot exclude 
the possibility $t = 2$ and later we shall see that this is indeed the case. 
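
The behaviour established in this example can be observed numerically. The following Python sketch is only an illustration in floating-point arithmetic, and the function name `iterate` is invented. It computes the first few terms of the iteration $a_n = \sqrt{a_{n-1} + 2}$ with $a_0 = 1$.

```python
import math


def iterate(steps):
    """Return [a_0, a_1, ..., a_steps] for a_0 = 1, a_n = sqrt(a_{n-1} + 2)."""
    terms = [1.0]
    for _ in range(steps):
        terms.append(math.sqrt(terms[-1] + 2))
    return terms


terms = iterate(10)
for n, value in enumerate(terms):
    print(n, value, 2 - value)
```

The printed terms increase and stay below 2, and the gap $2 - a_n$ shrinks rapidly. Only a modest number of steps is used on purpose: after many more steps the computed values become indistinguishable from 2 in floating point, which says something about rounding and nothing about the sequence itself, whose terms are all strictly less than 2.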


3.3.1 Limits and Inequalities 


We saw in the example of the last section that $a_n < 2$ for all $n$, and concluded, without 
giving any justification, that $\lim_{n\to\infty} a_n \le 2$. We have to assume that equality might 
hold for the limit, even though the inequalities are strict for the terms of the sequence. 
Here is a general rule (not requiring the sequences to be monotonic): 


Let $(a_n)_{n=1}^{\infty}$ and $(b_n)_{n=1}^{\infty}$ be convergent sequences such that $a_n \le b_n$ for each 
$n$. Then $\lim_{n\to\infty} a_n \le \lim_{n\to\infty} b_n$. 


Exercise Prove this and also show that if $a_n < b_n$ for all $n$ it is still possible for the 
limits to be equal. 


The rule sometimes goes under the name of preservation of inequalities. Similar 
ideas are involved in the commonly used squeeze rule, covered in the section on 
limit rules. 


3.3.2 Exercises 


1. The following result is often used to establish a limit; in particular it is useful in 
some of the succeeding exercises, after making an informed guess about the limit. 


Suppose that for a given sequence $(a_n)_{n=1}^{\infty}$ there exist $t$ and $k$, such that $0 < k < 1$ 
and $|a_n - t| \le k|a_{n-1} - t|$ for $n = 2, 3, \dots$. Prove that $\lim_{n\to\infty} a_n = t$. 


Note. Often it happens in applications that $|a_n - t| \le k|a_{n-1} - t|$ only holds for all $n$ from 
some index $n_0$. This makes no difference to the conclusion. 


2. Show that the limit is 2 for the sequence $(a_n)_{n=0}^{\infty}$ defined above by the relation 
$a_n = \sqrt{a_{n-1} + 2}$, with starting value $a_0 = 1$. 


3. A sequence $(a_n)_{n=1}^{\infty}$ is defined by $a_{n+1} = \frac{1}{4}(a_n^2 + 1)$, with starting value $a_1 = 0$. 
Show that $a_n$ is increasing and is bounded above by $2 - \sqrt{3}$. Deduce that the 
sequence is convergent. Then prove that the limit is $2 - \sqrt{3}$. 


Hint. It may help to observe that $2 - \sqrt{3}$ is a root of $x = \frac{1}{4}(x^2 + 1)$. 


4. A sequence is defined by $x_{n+1} = 2/(x_n + 1)$, with starting value $x_1 = 0$. Since 
the equation $x = 2/(x + 1)$ has only one positive root, and that is 1, the only 
reasonable candidate for a limit is 1. Prove that the limit is 1. 


Hint. It may help to observe that x, > 3 for alln > 3. 
5. The ratio $x_n = a_{n+1}/a_n$ of successive Fibonacci numbers satisfies the relation 

$$x_{n+1} = 1 + \frac{1}{x_n}.$$

As in the previous exercise a reasonable candidate for a limit is the positive root 
of the equation $x = 1 + 1/x$, or equivalently, $x^2 - x - 1 = 0$. This root is the 
famous number $\phi$, or the Golden Ratio. Prove that the limit is $\phi$ without using 
the formula for the Fibonacci numbers obtained in Sect. 2.2 Exercise 9. 


Hint. Show that $|x_{n+1} - \phi| \le |x_n - \phi|/\phi$. 


6. Define a sequence inductively by 

$$x_{n+1} = \frac{2x_n + 2}{x_n + 2},$$

with starting value $x_1 = 1$. Show that $\lim_{n\to\infty} x_n = \sqrt{2}$. Compare Sect. 2.2 
Exercise 3. 


3.4 Limit Rules 


In this section we derive the most important limit rules, which enable us to find new 
limits from previously known ones, without the need to find $N$ for each positive $\varepsilon$. 
They can be used to calculate limits involving rational functions of $n$, using as input 
only the two limits given in Proposition 3.1: $\lim_{n\to\infty} 1/n = 0$, and $\lim_{n\to\infty} x^n = 0$ 
given that $0 < x < 1$. Actually a third limit is needed, which one may think not worth 
stating: 

$$\lim_{n\to\infty} C = C,$$

the limit of the constant sequence with all values equal to $C$. 


Proposition 3.4 (Absolute value rule) Let $\lim_{n\to\infty} a_n = t$. Then $\lim_{n\to\infty} |a_n| = |t|$. 
The rule holds also for $t = \pm\infty$ if we define $|{-\infty}| = \infty$. 


Proof We consider the case $t \ne \pm\infty$, leaving the remaining cases to the reader. We 
have the inequality (see Sect. 2.2.9) 

$$\bigl| |a_n| - |t| \bigr| \le |a_n - t|.$$

Let $\varepsilon > 0$. Choose $N$, such that $|a_n - t| < \varepsilon$ for all $n \ge N$. Then for all $n \ge N$ we 
have $\bigl| |a_n| - |t| \bigr| < \varepsilon$ and we are done. 


Exercise Do the cases $t = \pm\infty$. 


Proposition 3.5 (Sum and product rules) Let $(a_n)_{n=1}^{\infty}$ and $(b_n)_{n=1}^{\infty}$ be convergent 
sequences, let $\lim_{n\to\infty} a_n = s$ and $\lim_{n\to\infty} b_n = t$. Then the sequences $(a_n + b_n)_{n=1}^{\infty}$ 
and $(a_n \cdot b_n)_{n=1}^{\infty}$ are convergent, and 

$$\lim_{n\to\infty} (a_n + b_n) = s + t, \qquad \lim_{n\to\infty} a_n \cdot b_n = s \cdot t.$$

Note carefully that the sequences are supposed to be convergent; the limits $\pm\infty$ are not 
allowed. 


Proof for the Sum Let $\varepsilon > 0$. There exists $N_1$, such that $|a_n - s| < \varepsilon/2$ for all 
$n \ge N_1$; and there exists $N_2$, such that $|b_n - t| < \varepsilon/2$ for all $n \ge N_2$. Let $N = 
\max(N_1, N_2)$. For all $n \ge N$ we have 

$$|(a_n + b_n) - (s + t)| \le |a_n - s| + |b_n - t| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$$

We conclude that $\lim_{n\to\infty} (a_n + b_n) = s + t$, thus completing the proof of the sum 
rule. 


This argument has some features that occur quite often. Firstly, after introducing 
$\varepsilon$ we produced an $N$, calling it $N_1$, that worked for $\varepsilon/2$. This was with hindsight. 
Other multiples of $\varepsilon$ might be preferred in other contexts. 

Secondly, we produced $N$ that worked for two sequences simultaneously. It is 
clear that we could have any finite number of convergent sequences, and find $N$ that 
works for a given $\varepsilon$ for all the sequences at the same time. This is because if one $N$ 
works for a given $\varepsilon$, then any integer bigger than $N$ will also work. Moreover a finite 
set of natural numbers has a highest member. From now on, when working with a 
finite number of sequences we will usually take this trick for granted. 


Proof for the Product The convergent sequence $(a_n)_{n=1}^{\infty}$ is bounded (Proposition 
3.2). Hence there exists $K > 0$, such that $|a_n| \le K$ for all $n$. 

Let $\varepsilon > 0$. The most obvious way to begin is to use the definition of limit to find a 
natural number $N$, such that $|a_n - s| < \varepsilon$ and $|b_n - t| < \varepsilon$ for all $n \ge N$. For these 
values of $n$ we have 

$$\begin{aligned}
|a_n \cdot b_n - s \cdot t| &= |a_n \cdot b_n - a_n \cdot t + a_n \cdot t - s \cdot t| \\
&\le |a_n \cdot b_n - a_n \cdot t| + |a_n \cdot t - s \cdot t| \\
&\le |a_n| \cdot |b_n - t| + |a_n - s| \cdot |t| \\
&< (K + |t|)\,\varepsilon.
\end{aligned}$$


We interrupt the proof to interpose some discussion. We have arrived at the 
conclusion that for each $\varepsilon > 0$ there exists $N$, such that $|a_n \cdot b_n - s \cdot t| < C\varepsilon$ for all 
$n \ge N$; in this case $C = K + |t|$. It is clear that $C$ does not depend on $\varepsilon$; nor does it 
depend on $n$ or $N$. That is why we were careful to define the constant $K$ before we 
introduced $\varepsilon$. It is always a good idea to introduce and define any constants, that one 
may want to use later, in the preamble, before the key phrase “Let $\varepsilon > 0$”. 

We restart the proof by backtracking to the point “Let $\varepsilon > 0$”, and choose a 
slightly different $N$. There exists $N$, such that $|a_n - s| < \varepsilon/C$ and $|b_n - t| < \varepsilon/C$ 
for all $n \ge N$. For the same $N$ we have $|a_n \cdot b_n - s \cdot t| < \varepsilon$ for all $n \ge N$. This 
concludes the proof. 


Of course the backtracking requires the benefit of hindsight. Since we know that 
it can be done we can scrap it altogether in our proofs. From now on we will be 
content, in proving that $\lim_{n\to\infty} a_n = t$, to find $N$ for each given $\varepsilon > 0$, such that 
for all $n \ge N$ we have $|a_n - t| < C\varepsilon$, provided we have made it clear that $C$ is 
independent of $\varepsilon$, $n$ and $N$. 


Proposition 3.6 (Reciprocal rule) Let $(a_n)_{n=1}^{\infty}$ be a convergent sequence, let 
$\lim_{n\to\infty} a_n = t$ and assume that $t \ne 0$. Then 

$$\lim_{n\to\infty} \frac{1}{a_n} = \frac{1}{t}.$$


Before we give the proof some explanatory discussion is needed. The terms $1/a_n$ 
form a sequence in the following sense: there exists $n_0$ such that $a_n \ne 0$ for all $n \ge n_0$ 
and the reciprocals form a sequence $(1/a_n)_{n=n_0}^{\infty}$. In fact we know that $\lim_{n\to\infty} |a_n| = 
|t|$ and $|t| > 0$. Hence there exists $n_0$, such that for all $n \ge n_0$ we have $|a_n| > \frac{1}{2}|t|$, 
and therefore also $a_n \ne 0$ (we are taking $\varepsilon = \frac{1}{2}|t|$ here). The reciprocal $1/a_n$ is then 
defined for all $n \ge n_0$. In this way we sidestep the possibility that $a_n$ may be 0 for 
certain place numbers $n$. 


Proof of Proposition 3.6 We have 

$$\frac{1}{a_n} - \frac{1}{t} = \frac{t - a_n}{t \cdot a_n}.$$

Let $\varepsilon > 0$. There exists $n_1$ such that $|a_n - t| < \varepsilon$ for all $n \ge n_1$. We have seen that 
there exists $n_0$, such that $|a_n| > \frac{1}{2}|t|$ for all $n \ge n_0$. Let $N = \max(n_0, n_1)$. For all 
$n \ge N$ we have 

$$\left| \frac{1}{a_n} - \frac{1}{t} \right| = \frac{|t - a_n|}{|t| \cdot |a_n|} < \frac{2}{|t|^2}\,\varepsilon.$$

The multiplier $2/|t|^2$ is independent of $n$, $N$ and $\varepsilon$. We conclude (see the discussion 
following the proof for the product rule) that $\lim_{n\to\infty} 1/a_n = 1/t$. 


By using the rule for product and the rule for reciprocal we obtain the rule for 
quotient. 


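Proposition 3.7 (Quotient rule) Let convergent sequences $(a_n)_{n=1}^{\infty}$ and $(b_n)_{n=1}^{\infty}$ be 
given. Let $\lim_{n\to\infty} a_n = s$ and $\lim_{n\to\infty} b_n = t$, and assume that $t \ne 0$. Then we have 

$$\lim_{n\to\infty} \frac{a_n}{b_n} = \frac{s}{t}.$$


Example Prove that $\lim_{n\to\infty} \dfrac{n^2 + 1}{2n^2 + 3}$ exists and find it. 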


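We shall write the argument in excruciating detail referencing all rules. First we 
have 

$$\frac{n^2 + 1}{2n^2 + 3} = \frac{1 + \dfrac{1}{n^2}}{2 + \dfrac{3}{n^2}}.$$

We know that $\lim_{n\to\infty} 1/n = 0$. By the product rule $\lim_{n\to\infty} 1/n^2 = 0$ and 
$\lim_{n\to\infty} 3/n^2 = 0$. By the sum rule 

$$\lim_{n\to\infty} \left(1 + \frac{1}{n^2}\right) = 1 \quad \text{and} \quad \lim_{n\to\infty} \left(2 + \frac{3}{n^2}\right) = 2.$$

Finally by the quotient rule 

$$\lim_{n\to\infty} \frac{n^2 + 1}{2n^2 + 3} = \frac{1}{2}.$$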


Actually the quotient rule tells you that the limit exists, as well as yielding its 
value. Usually in calculations using the rules of this section, we take the existence 
of the limit for granted, knowing that it is guaranteed by the rules being used. After 
a bit of practice most of the above steps can be carried out mentally. 
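
A calculation with the limit rules can be cross-checked by computing a few terms. The Python sketch below does so for the sequence $(n^2 + 1)/(2n^2 + 3)$, whose limit by the quotient rule is $1/2$. The function names `term` and `gap` are invented for the illustration, and exact rational arithmetic is used so that the distance from $1/2$ is not affected by rounding.

```python
from fractions import Fraction


def term(n):
    """The n-th term (n^2 + 1)/(2n^2 + 3) as an exact fraction."""
    return Fraction(n * n + 1, 2 * n * n + 3)


def gap(n):
    """Distance of the n-th term from 1/2."""
    return abs(term(n) - Fraction(1, 2))


for n in (1, 10, 100, 1000):
    print(n, term(n), gap(n))
```

Subtracting the fractions by hand gives the distance $1/(4n^2 + 6)$, which agrees with the printed values and makes it easy to find an $N$ for any given $\varepsilon$ directly, should one wish to avoid the rules.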

Next we have the squeeze rule.¹ 


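Proposition 3.8 (Squeeze rule) Suppose that $(a_n)_{n=1}^{\infty}$, $(b_n)_{n=1}^{\infty}$, $(c_n)_{n=1}^{\infty}$ are sequences 
and that 
$$a_n \le c_n \le b_n$$

for all $n$. Suppose that $(a_n)_{n=1}^{\infty}$ and $(b_n)_{n=1}^{\infty}$ are convergent with the same limit, 
$\lim_{n\to\infty} a_n = \lim_{n\to\infty} b_n = t$. Then $\lim_{n\to\infty} c_n = t$. 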


Proof Let $\varepsilon > 0$. There exists $N$, such that $|a_n - t| < \varepsilon$ and $|b_n - t| < \varepsilon$ for all 
$n \ge N$ (again, two sequences, same $N$). Hence, for all $n \ge N$, we have $a_n > t - \varepsilon$ 
and $b_n < t + \varepsilon$, from which we find that $t - \varepsilon < c_n < t + \varepsilon$. We conclude that 
$\lim_{n\to\infty} c_n = t$. 


For the case of squeezing with infinite limits, the following easily proved rules 
can be used. 


Suppose that $a_n \le b_n$ for each $n \ge 1$. If $\lim_{n\to\infty} a_n = \infty$ then $\lim_{n\to\infty} b_n = \infty$; and 
if $\lim_{n\to\infty} b_n = -\infty$ then $\lim_{n\to\infty} a_n = -\infty$. 

Exercise Prove the squeeze rules with infinite limits. 


Example We assume the reader is familiar with the function $\sin x$. All we need here 
is the fact that $|\sin x| \le 1$ for all $x$. Set $a_n = (\sin n)/n$. We have 

$$0 \le |a_n| \le \frac{1}{n}.$$

We conclude that $\lim_{n\to\infty} |a_n| = 0$, which is equivalent to $\lim_{n\to\infty} a_n = 0$. 
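
The squeeze used in this example can be observed numerically. The Python sketch below is an illustration in floating-point arithmetic (the function name `a` simply mirrors the notation of the example). It checks the inequality $0 \le |a_n| \le 1/n$ on a finite range of indices and prints a few terms.

```python
import math


def a(n):
    """The n-th term sin(n)/n, with n measured in radians."""
    return math.sin(n) / n


# The squeezing inequality, checked on a finite range only.
print(all(0 <= abs(a(n)) <= 1 / n for n in range(1, 2000)))

for n in (1, 10, 100, 1000):
    print(n, a(n), 1 / n)
```

The terms themselves change sign irregularly, so the sequence is not monotonic; it is the bound $1/n$, not any regularity of $\sin n$, that forces the limit to be 0.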


3.4.1 Exercises 


1. Prove some rules involving infinite limits: 


(a) Suppose that $\lim_{n\to\infty} a_n = t$, $\lim_{n\to\infty} b_n = \infty$ and $t$ is a finite number. Then 
$\lim_{n\to\infty} (a_n + b_n) = \infty$. 

(b) Suppose that $\lim_{n\to\infty} a_n = t$, $\lim_{n\to\infty} b_n = \infty$ and $t$ is a finite number, but 
is not 0. Then $\lim_{n\to\infty} a_n b_n$ equals $\infty$ if $t > 0$ and $-\infty$ if $t < 0$. 

(c) Suppose that $\lim_{n\to\infty} a_n = \infty$. Then $\lim_{n\to\infty} 1/a_n = 0$. 


(d) Suppose that $\lim_{n\to\infty} a_n = 0$. Is it the case that $\lim_{n\to\infty} 1/a_n = \infty$? Explain 
what happens. 


¹ Also known as the sandwich principle, or the two policemen and a drunk rule. 


2. Find the limit 

$$\lim_{n\to\infty} \frac{(6n + 1)(5n + 2)(4n + 3)(3n + 4)(2n + 5)(n + 6)}{(n - 1)(n + 7)(n - 11)(n + 15)(n - 21)(n + 101)}.$$

3. Let $(a_n)_{n=1}^{\infty}$ be the sequence of Fibonacci numbers (that is, $a_1 = a_2 = 1$, $a_n = 
a_{n-1} + a_{n-2}$ for $n \ge 3$). Compute the limit $\lim_{n\to\infty} a_{n+1}/a_n$ using the formula 
for $a_n$ derived in Sect. 2.2 Exercise 9. 

4. Let the sequence $(a_n)_{n=1}^{\infty}$ satisfy the recurrence relations 

$$a_n = \beta a_{n-1} + \gamma a_{n-2}, \qquad n = 3, 4, 5, \dots$$

where $a_1$ and $a_2$ have prescribed values. Assume that the second-order equation 
$\lambda^2 - \beta\lambda - \gamma = 0$ has two real roots $r_1$ and $r_2$, and that $0 < |r_1| < |r_2|$. Show that 
$\lim_{n\to\infty} a_{n+1}/a_n = r_2$ except in the case when $a_2 = r_1 a_1$. 

5. Let $a$ and $b$ be positive constants. Find $\lim_{n\to\infty} (a^n + b^n)^{1/n}$. 

6. Let $a$ and $b$ be positive constants. Find 

$$\lim_{n\to\infty} \frac{a^n - b^n}{a^n + b^n}.$$

7. Find $\lim_{n\to\infty} (n!)^{1/n}$. 

Hint. First show that if $1 \le k < n$ then 

$$(n!)^{\frac{1}{n}} \ge (k + 1)^{1 - \frac{k}{n}} (k!)^{\frac{1}{n}}.$$

You might find Sect. 3.2 Exercise 7 useful. 

8. Find $\lim_{n\to\infty} \left( n - \sqrt{(n + a)(n + b)} \right)$. 

9. Let $(h_n)_{n=1}^{\infty}$ be a sequence of positive numbers such that $\lim_{n\to\infty} h_n = 1$. Let $\alpha$ 
be a non-zero rational. Prove that $\lim_{n\to\infty} h_n^{\alpha} = 1$. 

Hint. You will need the laws of exponents for rational powers (Sect. 2.2 
Exercise 19). By the reciprocal rule one may assume that $\alpha > 0$. If $\alpha$ is an integer it’s 
easy. If $\alpha$ is not an integer one may reduce it to the case $0 < \alpha < 1$ by subtracting 
an integer and using the product rule. Now use the squeeze rule, observing that 
if $0 < x < 1$ then $x < x^{\alpha} < 1$ whilst if $1 < x$ then $1 < x^{\alpha} < x$. 


10. Recall the inequality of arithmetic and geometric means proved in elementary
algebra. For positive real numbers a and b this states that

$$\sqrt{ab} \le \frac{a + b}{2}$$

with equality if and only if a = b.

Let a and b be positive and distinct. Define sequences $(a_n)_{n=0}^\infty$ and $(b_n)_{n=0}^\infty$
recursively by

$$a_0 = a, \quad b_0 = b, \quad a_{n+1} = \tfrac{1}{2}(a_n + b_n), \quad b_{n+1} = \sqrt{a_n b_n}$$

for n = 0, 1, 2, ....

(a) Show that
$$b_n < b_{n+1} < a_{n+1} < a_n$$
for all $n \ge 1$.
(b) Show that the limits $\lim_{n\to\infty} a_n$ and $\lim_{n\to\infty} b_n$ exist and are equal.

Note. The common limit is called the arithmetic-geometric mean of a and b and is sometimes
denoted by M(a, b). It was studied by Gauss and has some surprising applications in
computation theory. It will be revisited in Sect. 5.2 and again in Sect. 11.2.
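
A numerical experiment can make the behaviour of the two sequences concrete before attempting the proofs. The following is only an illustrative sketch: the function name `agm_sequences` and the starting values 1 and 2 are choices made here, not part of the exercise, and floating-point output is of course no substitute for a proof.

```python
import math

def agm_sequences(a, b, steps):
    """Return the lists [a_0, ..., a_steps] and [b_0, ..., b_steps]
    produced by the arithmetic-geometric mean recursion."""
    a_vals, b_vals = [a], [b]
    for _ in range(steps):
        # Both updates use the old pair (a_n, b_n).
        a, b = (a + b) / 2, math.sqrt(a * b)
        a_vals.append(a)
        b_vals.append(b)
    return a_vals, b_vals

a_vals, b_vals = agm_sequences(1.0, 2.0, 4)
for n, (an, bn) in enumerate(zip(a_vals, b_vals)):
    print(n, an, bn)
```

With these starting values the printed columns close in on each other very quickly, which is consistent with the claim in part (b) that the two limits coincide.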


3.5 Limit Points of Sets 


In this section we shall study an important property that pertains to completely arbi- 
trary subsets of R. We shall also exhibit a second important application of supremum. 
The distinction between finite and infinite will play an important role in our consid- 
erations. 


Definition Let A be a subset of R. A number $t$ is called a limit point of the set A if
the following condition is satisfied:


For each $\varepsilon > 0$ there exists $x \in A$, such that $x \ne t$ but $x$ lies in the interval
$]t - \varepsilon, t + \varepsilon[$.


Do not confuse limit point of a set with limit of a sequence. A set may have many 
limit points, or none. Do note the following points listed here: 


(a) A limit point $t$ of A is not necessarily in A, though it may be in A.

(b) The condition that $x$ lies in the interval $]t - \varepsilon, t + \varepsilon[$ and is not equal to $t$ may
be written as the inequalities $0 < |x - t| < \varepsilon$.

(c) We can express the definition using set theory: for each $\varepsilon > 0$ the set

$$A \cap \bigl(\,]t - \varepsilon, t + \varepsilon[\; \setminus \{t\}\bigr)$$

is non-empty.

(d) It is easy to see that the set in the last item, if non-empty for every $\varepsilon > 0$, must
be infinite for every $\varepsilon > 0$.


Exercise Prove this claim. By way of a hint: if the set is finite for a certain $\varepsilon$, what
can you say about the set of numbers $|x - t|$ for which $x$ is in the set?

(e) Informally, $t$ is a limit point of the set A if we may approximate $t$ by points in
A, that are distinct from $t$ if it happens that $t$ is in A.


3.5.1 Weierstrass’s Theorem on Limit Points 


In this section the predicates “finite” and “infinite” will be used in proofs. We need 
to be a little clearer about their precise meaning without going too deeply into set 
theory. 

As a working definition of finite set, we can use the following. 


Definition A set A is finite if either it is empty, or, if not empty, there exists a natural
number $N$ and a sequence $(a_n)_{n=1}^N$, such that A is the set $\{a_n : 1 \le n \le N\}$.


The set $\{a_n : 1 \le n \le N\}$, as the curly brackets indicate, is just the set of values
appearing as terms of the sequence $(a_n)_{n=1}^N$, ignoring all repetitions. To conclude that
the set has N elements we must assume that the terms of the sequence are distinct.
Children know that you have to be careful not to count the same sweet twice.

We could of course incorporate into the definition of finite set the requirement
that the terms of the sequence are already distinct. However, it should be obvious
that if A is finite according to our definition, and not empty, then there exists some
natural number $L \le N$, such that A can be presented as a sequence with distinct terms
and index set $\{1, \ldots, L\}$; just proceed from left to right throwing out repetitions. The
number L, the cardinality of A, is uniquely defined by A, a fact that should be proved
if this were a rigorous text on set theory, but which we shall simply accept as intuitively
obvious.

There is then little mystery about the following notion. 


Definition A set is infinite if it is not finite. 


Thankfully, we have a plentiful supply of infinite sets: for example N, R, the 
intervals ]a, b[ for a < b, and loads of sets formed from these using set-building
operations. This is just the start. 

A number of properties pertaining to the dichotomy of infinite set versus finite set 
will be frequently used. Maybe they are obvious, but it is useful to list them here. In 
a proper account of set theory they are theorems. 


(a) A subset of a finite set is finite. 
(b) If A is an infinite set and B is a set such that $A \subset B$ then B is infinite.
(c) The union of two finite sets is finite. 


Some useful, and equally obvious (or otherwise) facts follow from these two. 


(d) If A is an infinite set and B is a finite set then the set difference A \ B (the set 
of all elements of A that are not in B) is infinite. 
(e) The union of a finite number of finite sets is finite. 
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The following result is sometimes called Weierstrass’s theorem on limit points. We 
will use it only once: to prove the Bolzano–Weierstrass theorem (Proposition 3.10),
and, even so, another proof is suggested in the exercises that does not depend on
Weierstrass’s theorem. The dichotomy of finite set versus infinite set is essential
here, and we apply it to completely arbitrary subsets of R and not only to sequences;
Cantor showed us that these notions are different, there being sets of real numbers not
expressible as sequences. Going beyond sequences might put Weierstrass’s theorem
outside fundamental analysis, except that there is a long tradition of including it,
going back to Hardy’s “A Course of Pure Mathematics”.


Proposition 3.9 Let A be an infinite but bounded set of real numbers. Then A has 
at least one limit point. 


Proof Define a subset $B \subset \mathbb{R}$ using the specification:

$$B = \{x \in \mathbb{R} : \text{the set of all } y \in A \text{ that satisfy } y < x \text{ is finite}\}.$$

Now B is not empty; it contains all lower bounds of A and such points exist. And
B is bounded above, since, A being infinite, all upper bounds of A (and such also
exist) are also upper bounds of B. Hence the supremum $t = \sup B$ exists. We shall
show that $t$ is a limit point of A.

Let $\varepsilon > 0$. Then $t + \varepsilon$ is not in B so that infinitely many elements of A are below
$t + \varepsilon$. On the other hand there must exist an element of B in the interval $]t - \varepsilon, t]$
(for otherwise it would not be true that $t = \sup B$). By rule (a) in the list preceding
the proposition, we see that at most finitely many elements of A are below $t - \varepsilon$.
By rule (d) we then see that infinitely many elements of A must lie in the interval
$]t - \varepsilon, t + \varepsilon[$. At least one of them is not the same as $t$.


The proof was a typical use of supremum, and a particularly elegant one. We want 
to show that the set A has a limit point. The definition of limit point refers to a point 
$t$, so we need to conjure up a point on which to test the definition. It is supremum
(or in other cases infimum) that does the conjuring, providing a likely candidate for 
limit point. 

Compare this to the slightly simpler case of Proposition 3.3. A candidate for the 
limit of a sequence was needed; it was again provided by supremum. Ultimately it 
is axiom C1 that guarantees us that elements of R, the existence of which we need, 
do in fact exist. As a matter of fact the set B in the proof of Proposition 3.9 is the left 
set of a Dedekind section, so we could have produced the point $t$ by a direct appeal
to axiom C1.


3.5.2 Exercises 


1. Show that the limit points of an interval are all the points of the interval as well as 
the endpoints (whether or not the endpoints are in the interval); the only exception 
being the degenerate interval [a, a] = {a}, which has no limit points. 
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2. Find all limit points of the following subsets of R: 


(a) N 

(b) Q 

(c) The set of all rationals of the form $a/10^n$, where a is an integer and n a
natural number.

(d) R \ Q (the set of all irrationals) 

(e) The set A, where A is finite. 

(f) The set R \ A, where A is finite. 


3. Let $k_1$ and $k_2$ be real numbers, neither 0. Suppose that $k_1/k_2$ is irrational. Let A
be the set of all real numbers that can be written as $mk_1 + nk_2$ for some integers
m and n. Prove that every real number is a limit point of A.


Hint. See Sect.2.5 Exercise 3. 


4. Show that if $t$ is a limit point of a set A of real numbers, then there exists a
sequence $(a_n)_{n=1}^\infty$ of points in A, all distinct from $t$ (if $t$ should be in A), such that
$\lim_{n\to\infty} a_n = t$.


Hint. Apply the definition of limit point to a sequence of $\varepsilon$’s of the form 1/n.


Note. Actually a new axiom of set theory, the axiom of choice, is required to build the sequence
for a set A in all generality. The use of this axiom goes beyond the scope of this book and the need
for it has usually not bothered analysts. We shall rarely mention it. For most sets that we shall
consider here, such as intervals, or finite unions of intervals, the sequence can be constructed
more or less explicitly.


3.6 Subsequences 


Let $(a_n)_{n=1}^\infty$ be a sequence of real numbers. Let $(k_n)_{n=1}^\infty$ be a strictly increasing
sequence of natural numbers, that is, $k_n < k_{n+1}$ for each n. The sequence $(a_{k_n})_{n=1}^\infty$ is
called a subsequence of the sequence $(a_n)_{n=1}^\infty$.

Thus starting with the sequence of natural numbers $(n)_{n=1}^\infty$ we can form the
sequence of even natural numbers by taking $k_n = 2n$. We can form the sequence
of primes by taking $k_n = \pi_n$, the latter being a common symbol for the nth prime.
Clearly there is immense freedom to construct subsequences of a given sequence. 

We have seen that a convergent sequence is always bounded, but that a bounded
sequence is not always convergent, although it is if also monotonic. About an arbitrary
bounded sequence we have the following proposition, the start of a story that extends
far into analysis and topology.


Proposition 3.10 (Bolzano–Weierstrass theorem) Every bounded sequence of real
numbers has a convergent subsequence.

Proof Let the real number sequence $(a_n)_{n=1}^\infty$ be bounded. If the same value $t$
appears infinitely often in it, so that for example $a_{k_n} = t$ for $k_1 < k_2 < k_3 < \cdots$,
then $\lim_{n\to\infty} a_{k_n} = t$ and we have a convergent subsequence.

Assume next that no real number appears infinitely often in the sequence. Then
the set of all values in the sequence, let us call it A, is an infinite set of real numbers.
(If that is not obvious, try to prove it using the properties of finite sets and infinite
sets given in the last section.) The set A is bounded, and so has a limit point $t$ (by
Proposition 3.9). We construct a subsequence with limit $t$ using induction. Find an
index $k_1$ such that

$$0 < |t - a_{k_1}| < 1.$$

Suppose that we have found an increasing sequence of natural numbers $k_1, k_2, \ldots, k_n$
such that

$$0 < |t - a_{k_j}| < 1/j$$

for $j = 1, 2, \ldots, n$. There exists an integer, higher than $k_n$, which we can call $k_{n+1}$,
such that

$$0 < |t - a_{k_{n+1}}| < 1/(n + 1).$$

The subsequence $(a_{k_n})_{n=1}^\infty$ thus constructed converges to $t$.


This proposition is immensely important. We do not want any mistakes in the
proof. How could we be sure in the first step, when $t$ occurred infinitely often in the
sequence $(a_n)_{n=1}^\infty$, that the sequence $(k_n)_{n=1}^\infty$ really existed? In the induction argument
of the second step, how could we be sure that $k_{n+1}$ really existed?

This is one of those cases when there are hidden appeals to Proposition 2.1. In
the first step there is an infinite set of natural numbers B, comprising the set of all k
such that $a_k = t$. We need to arrange B as an increasing sequence $(k_n)_{n=1}^\infty$ of natural
numbers. We take $k_1$ as the lowest member of B, then $k_2$ as the lowest member
after removing $k_1$, then $k_3$ as the lowest member after removing $k_1$ and $k_2$, and so
on. Because B is infinite we never empty it in a finite number of steps; there are
always some numbers remaining and we can choose the lowest as the next term in
the sequence. This can be expressed formally by the inductive definition:

$$k_{n+1} = \min(B \setminus \{k_1, \ldots, k_n\}).$$

In the second step, because $t$ is a limit point of the set A, we know that for each
$\varepsilon > 0$ there exists $x$ in A, that satisfies $0 < |t - x| < \varepsilon$. In particular there exists $x$
in A that satisfies $0 < |t - x| < 1/(n + 1)$. But we also want $x$ to have an index
higher than $k_n$. Consider the numbers $|t - a_m|$ as m ranges from 1 up to $k_n$. Some of
these may be 0, whilst others are certainly non-zero, for example $|t - a_{k_n}|$. Choose
$\varepsilon > 0$ smaller than $1/(n + 1)$ and smaller than all those numbers $|t - a_m|$ which are
non-zero and for which $m \le k_n$ (there are only finitely many of these). We could give
a formula for $\varepsilon$ using the min-function, but it would not be very readable. The set
of natural numbers m such that $0 < |t - a_m| < \varepsilon$ is not empty (because $t$ is a limit
point of A) and all such m satisfy $m > k_n$ (because of the way we selected $\varepsilon$). We
can choose the lowest of them as $k_{n+1}$.
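
The inductive choice of indices can be tried out on a concrete sequence. The sketch below is an illustration under assumptions made here: the example sequence $a_m = (-1)^m(1 + 1/m)$, the limit point $t = 1$ of its set of values, the helper names, and the cut-off `search_limit` are all hypothetical. It also searches directly for the lowest index above $k_n$ that satisfies the inequality, a small variant of the argument in the text, which first shrinks $\varepsilon$.

```python
def a(m):
    """Example bounded sequence (an assumption for illustration)."""
    return (-1) ** m * (1 + 1 / m)

def next_index(t, prev_k, bound, search_limit=10**6):
    """Lowest m > prev_k with 0 < |t - a_m| < bound, searched below search_limit."""
    for m in range(prev_k + 1, search_limit):
        if 0 < abs(t - a(m)) < bound:
            return m
    raise ValueError("no suitable index found below search_limit")

def subsequence_indices(t, count):
    """Indices k_1 < k_2 < ... < k_count with 0 < |t - a_{k_n}| < 1/n."""
    ks, prev_k = [], 0
    for n in range(1, count + 1):
        prev_k = next_index(t, prev_k, 1 / n)
        ks.append(prev_k)
    return ks

ks = subsequence_indices(1.0, 5)
print(ks)                      # indices of the chosen terms
print([a(k) for k in ks])      # the terms themselves, approaching 1
```

Taking the lowest admissible index at each step mirrors the appeal to Proposition 2.1: a non-empty set of natural numbers has a least member, so no arbitrary choice is involved.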

Of course describing the proof in such fine detail might be thought rather pedantic, 
but as we defined a sequence as an assignment of terms to the natural numbers it may 
seem wise at least once to describe the assignment, especially where the stakes are 
so high. We can in future omit these details since for most purposes the first version 
we gave of the proof is quite sufficient. But it is worth reflecting on the mistakes that 
the great mathematicians of the past have made in analysis, so we should not allow 
ourselves to become too complacent. 

Proposition 3.10 is often called the Bolzano–Weierstrass theorem. Another proof
of this proposition depends on showing that every sequence has a monotonic
subsequence. This bypasses the use of Weierstrass’s theorem on limit points, but still
depends on the dichotomy of finite set versus infinite set (see the exercises).

Another important use of subsequences arises in the negation of the statement
$\lim_{n\to\infty} a_n = t$, often needed for constructing proofs by contradiction. Recall that
the negation is equivalent to saying that there exists $\varepsilon > 0$, such that for every N
there exists $n \ge N$, such that $|a_n - t| \ge \varepsilon$.

We can go further for the $\varepsilon$ in question. We can produce a subsequence $(a_{k_n})_{n=1}^\infty$
such that $|a_{k_n} - t| \ge \varepsilon$. Consider the set of all n greater than or equal to N such that
$|a_n - t| \ge \varepsilon$. The whole point is that this set is not empty. We can therefore assign to N the
lowest number k (using here Proposition 2.1) greater than or equal to N for which
$|a_k - t| \ge \varepsilon$. We use this assignment to define the subsequence $(a_{k_n})_{n=1}^\infty$ inductively.
We start at N = 1 and assign $k_1$. Having assigned $k_n$ we reset N to $k_n + 1$ and assign
$k_{n+1}$, and so on. The result of this discussion is as follows:


Proposition 3.11 The negation of the statement $\lim_{n\to\infty} a_n = t$ is equivalent to the
following: there exists $\varepsilon > 0$ and a subsequence $(a_{k_n})_{n=1}^\infty$, such that $|a_{k_n} - t| \ge \varepsilon$ for
all n.
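
The assignment "start at N = 1, take the lowest admissible index, then reset N" translates almost word for word into a loop. The following sketch is illustrative only: the function name `witness_indices`, the example sequence $a_n = (-1)^n$ with $t = 1$ and $\varepsilon = 1$, and the cut-off `search_limit` are assumptions introduced here.

```python
def witness_indices(a, t, eps, count, search_limit=10**6):
    """Return k_1 < ... < k_count with |a(k_n) - t| >= eps, each chosen
    as the lowest admissible index that is >= the current N."""
    ks, N = [], 1
    for _ in range(count):
        k = next((k for k in range(N, search_limit) if abs(a(k) - t) >= eps), None)
        if k is None:
            raise ValueError("no admissible index found below search_limit")
        ks.append(k)
        N = k + 1  # reset N, as in the text
    return ks

# The sequence (-1)^n does not have the limit 1: the odd-indexed terms stay at distance 2.
ks = witness_indices(lambda n: (-1) ** n, t=1, eps=1, count=5)
print(ks)
```

For a sequence that does converge to $t$ the search eventually finds nothing for small enough indices, which is why the sketch needs a cut-off; the proposition itself involves no such bound.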


3.6.1 Exercises 


1. Prove that every sequence has a monotonic subsequence. Use this to give another 
proof of the Bolzano—Weierstrass theorem. 


Hint. Consider the set of all integers n with the property that $a_m < a_n$ for all
m > n, and reflect on the consequences of its being finite or infinite.

2. Show that if a sequence $(a_n)_{n=1}^\infty$ is not bounded above then there is a subsequence
$(a_{k_n})_{n=1}^\infty$ such that $\lim_{n\to\infty} a_{k_n} = \infty$. A similar result holds if the sequence is not
bounded below, but $-\infty$ replaces $\infty$.

3. Prove the following proposition. Suppose a sequence has the limit $t$ (which may
be $\infty$). Show that every subsequence also has the limit $t$.

4. Prove a converse to the result of the previous exercise, with a twist. Assume that
every subsequence of the sequence $(a_n)_{n=1}^\infty$ has a limit, but we do not assume that
they all have the same limit. We also allow $\infty$ and $-\infty$ here as limits. Prove that
the sequence $(a_n)_{n=1}^\infty$ has a limit.

5. Let $(a_n)_{n=1}^\infty$ be a sequence of real numbers and suppose that the set of values
$A = \{a_n : n = 1, 2, \ldots\}$ appearing in the sequence has a limit point $t$. Show that
there exists a subsequence $(a_{k_n})_{n=1}^\infty$, possessing distinct terms, that converges to $t$.


3.7 Cauchy’s Convergence Principle 


We introduce a condition that is both necessary and sufficient for a sequence to be 
convergent, but does not mention a candidate for the limit. We can identify whether 
or not a sequence is convergent without going outside it; by studying in fact the 
terms alone. The condition has important, mainly theoretical, applications throughout 
analysis and allied subjects, and will reappear in this text in the context of limits of 
functions. 


Proposition 3.12 (Cauchy’s convergence principle) A sequence $(a_n)_{n=1}^\infty$ of real
numbers is convergent if and only if it satisfies the following condition (Cauchy’s
condition): for all $\varepsilon > 0$ there exists a natural number N, such that for all $m \ge N$
and $n \ge N$ we have $|a_m - a_n| < \varepsilon$.


Proof As is often appropriate when proving that a condition is necessary and
sufficient, we split the proof into two parts.

(a) Cauchy’s condition is necessary for convergence. Assume $\lim_{n\to\infty} a_n = t$. Let
$\varepsilon > 0$. Choose N, such that $|a_n - t| < \varepsilon/2$ for all $n \ge N$. If now $n \ge N$ and $m \ge N$
we have

$$|a_n - a_m| \le |a_n - t| + |t - a_m| < \tfrac{1}{2}\varepsilon + \tfrac{1}{2}\varepsilon = \varepsilon.$$

(b) Cauchy’s condition is sufficient for convergence. Assume that Cauchy’s condition
is satisfied. First we show that the sequence $(a_n)_{n=1}^\infty$ is bounded. We let $\varepsilon = 1$, find
a corresponding N and let m = N in Cauchy’s condition. For all $n \ge N$ we have
$|a_n - a_N| < 1$, which gives $|a_n| < |a_N| + 1$. That is to say, all terms of the sequence,
except perhaps a finite number, satisfy $|a_n| < K$ where $K = |a_N| + 1$.

By the Bolzano–Weierstrass theorem (Proposition 3.10) there exists a convergent
subsequence, $(a_{k_n})_{n=1}^\infty$, and we let $t = \lim_{n\to\infty} a_{k_n}$. Combined with Cauchy’s
condition this forces $a_n$ to converge to $t$. For let $\varepsilon > 0$. Choose N, such that $|a_n - a_m| < \varepsilon/2$
for all $n \ge N$ and $m \ge N$. We can find a term in the convergent subsequence, for
example $a_{k_j}$ with index $k_j \ge N$, and which satisfies $|a_{k_j} - t| < \varepsilon/2$. But then for
all $n \ge N$ we have

$$|a_n - t| \le |a_n - a_{k_j}| + |a_{k_j} - t| < \tfrac{1}{2}\varepsilon + \tfrac{1}{2}\varepsilon = \varepsilon.$$

This ends the proof.

The importance of Cauchy’s principle for analysis and its further developments
is such that it is desirable to exhibit Cauchy’s condition separately from
Proposition 3.12, for the reader’s closer perusal:


Cauchy’s condition. For all $\varepsilon > 0$ there exists a natural number N, such that
for all $m \ge N$ and $n \ge N$ we have $|a_m - a_n| < \varepsilon$.


Cauchy’s condition requires that $|a_m - a_n|$ should be smaller than $\varepsilon$, only in virtue of
$m \ge N$ and $n \ge N$. The separation of m and n can be vast, in fact there is no upper
limit on it. It is a common beginner’s error to think that Cauchy’s condition is satisfied
if, for each $\varepsilon > 0$, there exists N, such that $|a_n - a_{n+1}| < \varepsilon$ for all $n \ge N$. One can
rephrase the condition throwing more emphasis on the arbitrariness of $m - n$. This
leads to a useful alternative formulation:


Cauchy’s condition, second version. For every $\varepsilon > 0$ there exists a natural
number N, such that for all $n \ge N$, and for all natural numbers p, we have
$|a_{n+p} - a_n| < \varepsilon$.
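
The beginner’s error mentioned above can be seen numerically. The sketch below uses the sequence $a_n = \sqrt{n}$, an example chosen here and not taken from the text: consecutive terms get arbitrarily close, yet $a_{4n} - a_n = \sqrt{n}$ grows without bound, so Cauchy’s condition fails.

```python
import math

def a(n):
    """Example sequence a_n = sqrt(n) (an assumption for illustration)."""
    return math.sqrt(n)

for n in (10, 1000, 100000):
    consecutive_gap = a(n + 1) - a(n)   # tends to 0 as n grows
    distant_gap = a(4 * n) - a(n)       # equals sqrt(n), which is unbounded
    print(n, consecutive_gap, distant_gap)
```

In the language of the second version of the condition, the difficulty is that p is allowed to depend on n: here p = 3n already defeats every proposed N.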


3.8 Convergence of Series 


In all mathematics there is not a single infinite series whose convergence has 
been established by rigorous methods. (Letter from N. H. Abel (1828)) 


The correct definition of convergence of the infinite series $\sum_{k=1}^\infty a_k$ is one of the
main achievements of analysis and it dispersed a great deal of nonsense that had
beset mathematics.


Let $(a_n)_{n=1}^\infty$ be a real number sequence and let $s_n = \sum_{k=1}^n a_k$ for n = 1, 2, 3, ....


Definition If the sequence $(s_n)_{n=1}^\infty$ is convergent and $\lim_{n\to\infty} s_n = t$, we say that the
infinite series $\sum_{k=1}^\infty a_k$ is convergent, and write

$$\sum_{k=1}^\infty a_k = t.$$

We call $t$ the sum of the series. The numbers $s_n$ are called partial sums. A series
that is not convergent is said to be divergent.

Note that $t$ is supposed to be a finite number when the series is convergent. If
$\lim_{n\to\infty} s_n = \infty$ or $\lim_{n\to\infty} s_n = -\infty$ we write $\sum_{k=1}^\infty a_k = \infty$ or $\sum_{k=1}^\infty a_k = -\infty$,
but the series in both these cases is divergent.

A series can be formed by starting at other indices than k = 1, for example k = N.
Then to say $\sum_{k=N}^\infty a_k = t$ means that the partial sums $s_n = \sum_{k=N}^n a_k$ have the limit $t$.
It is then easy to see that

$$\sum_{k=1}^\infty a_k = \sum_{k=1}^{N-1} a_k + \sum_{k=N}^\infty a_k,$$

given that either of the two infinite series appearing in this equation is convergent.


3.8.1 Rules for Series 


Some simple facts about series follow easily from the limit rules for sequences given 
in Sect. 3.4. The reader is invited to supply the proofs. We obtain important and useful 
rules for manipulating series. 


(i) (Multiplication by a constant) Let $\sum_{k=1}^\infty a_k$ be a convergent series and let its sum
be $t$. Let $\alpha$ be a real number. Then the series $\sum_{k=1}^\infty \alpha a_k$ is convergent and

$$\sum_{k=1}^\infty \alpha a_k = \alpha t.$$

(ii) (Sum of two series) Let $\sum_{k=1}^\infty b_k$ be a second convergent series, and let its sum
be $s$. Then the series $\sum_{k=1}^\infty (a_k + b_k)$ is convergent and

$$\sum_{k=1}^\infty (a_k + b_k) = t + s.$$


3.8.2 Convergence Tests 


A big part of the theory of infinite series consists of the so-called convergence tests. 
These enable us to establish that a series is convergent (or in some cases divergent) 
by examining the sequence of terms, but without proposing a candidate for the sum. 


Proposition 3.13 If $\sum_{k=1}^\infty a_k$ is convergent then $\lim_{k\to\infty} a_k = 0$.


Proof Let $s = \sum_{k=1}^\infty a_k$ and let $s_n = \sum_{k=1}^n a_k$ for each n. Then $\lim_{n\to\infty} s_n = s$ but
we also have $\lim_{n\to\infty} s_{n-1} = s$. It follows that

$$\lim_{n\to\infty} (s_n - s_{n-1}) = s - s = 0.$$

But $s_n - s_{n-1} = a_n$, so that $\lim_{n\to\infty} a_n = 0$.


The condition $\lim_{k\to\infty} a_k = 0$ is, in view of Proposition 3.13, a necessary condition
for convergence of the series. It is far from being sufficient. Proposition 3.13 is
thus a divergence test and a useful one. If $\lim_{k\to\infty} a_k$ does not exist, or if it exists
but is not 0, then the series $\sum_{k=1}^\infty a_k$ is divergent. But if $\lim_{k\to\infty} a_k = 0$, one cannot
conclude from that alone that $\sum_{k=1}^\infty a_k$ is convergent (it is a common beginner’s error
to think otherwise).


3.8.3 The Simplest Convergence Tests: Positive Series 


A series $\sum_{k=1}^\infty a_k$ is said to be a positive series if $a_k \ge 0$ for each k. The partial sums
$s_n$ then form an increasing sequence. By Proposition 3.3 an increasing sequence is
either convergent or it tends to $\infty$. So we have the following basic results.


Proposition 3.14 Let $\sum_{k=1}^\infty a_k$ be a positive series. Then it is convergent if and only
if there exists K > 0, such that $\sum_{k=1}^n a_k < K$ for each n.


The proposition can be paraphrased loosely by saying that a positive series is con- 
vergent if and only if its partial sums are bounded. 


Proof The sequence of partial sums is increasing and so is convergent if and only if 
it is bounded above. 


Proposition 3.15 (The comparison test) Assume that $\sum_{k=1}^\infty a_k$ and $\sum_{k=1}^\infty b_k$ are
positive series and $a_k \le b_k$ for each k. Then we have


(1) If $\sum_{k=1}^\infty b_k$ is convergent then $\sum_{k=1}^\infty a_k$ is also convergent.
(2) If $\sum_{k=1}^\infty a_k$ is divergent then $\sum_{k=1}^\infty b_k$ is also divergent.


Proof We use the inequalities

$$\sum_{k=1}^n a_k \le \sum_{k=1}^n b_k \le \sum_{k=1}^\infty b_k,$$

in which the third member could be $\infty$. If $\sum_{k=1}^\infty b_k$ is convergent then the sums
$\sum_{k=1}^n a_k$ are bounded above by the finite number $\sum_{k=1}^\infty b_k$, and the series $\sum_{k=1}^\infty a_k$ is
therefore convergent. If $\sum_{k=1}^\infty a_k$ is divergent then the sums $\sum_{k=1}^n a_k$ are not bounded
above (they tend to $\infty$), and so the sums $\sum_{k=1}^n b_k$ also tend to $\infty$.


Proposition 3.16 (Limit comparison test) Given positive series $\sum_{k=1}^\infty a_k$ and
$\sum_{k=1}^\infty b_k$, in which no terms are zero, we assume that the limit $\ell = \lim_{k\to\infty} a_k/b_k$
exists and satisfies $0 < \ell < \infty$. Then either the series $\sum_{k=1}^\infty a_k$ and $\sum_{k=1}^\infty b_k$ are both
convergent or they are both divergent.


Proof Assume that $\sum_{k=1}^\infty b_k$ is convergent. There exists N, such that for all $k \ge N$ we
have $a_k/b_k < \ell + 1$, and therefore also $a_k < (\ell + 1)b_k$. But the series $\sum_{k=1}^\infty (\ell + 1)b_k$
is convergent so that, by the comparison test, the series $\sum_{k=1}^\infty a_k$ is also convergent.

Next assume that $\sum_{k=1}^\infty a_k$ is convergent. We note that $\lim_{k\to\infty} b_k/a_k = 1/\ell$ and
use the same argument.
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3.8.4 Geometric Series and D’Alembert’s Test 


In order to use the comparison test we need some series, whose convergence status 
is already known, to use as a yardstick. 

A series $\sum_k a_k$ is called a geometric series, if there exists r independent of k,
such that $a_k = r a_{k-1}$ for each k. In short, the ratio $a_k/a_{k-1}$ is constant, provided no
term is zero. It is convenient to write a geometric series with the starting index k = 0.
Then $a_k = r^k a_0$ and the series has the form

$$\sum_{k=0}^\infty a_0 r^k.$$

From algebra we know that

$$\sum_{k=0}^n r^k = \begin{cases} \dfrac{r^{n+1} - 1}{r - 1} & \text{if } r \ne 1, \\[1ex] n + 1 & \text{if } r = 1. \end{cases}$$


Exercise Prove this formula.

The formula for the sum of a convergent geometric series is a basic result generally
taught in school mathematics.


Proposition 3.17 Assume that $a_0 \ne 0$. Then the geometric series $\sum_{k=0}^\infty a_0 r^k$ is
convergent if and only if $|r| < 1$. In this case its sum is

$$\sum_{k=0}^\infty a_0 r^k = \frac{a_0}{1 - r} = \frac{\text{First term}}{1 - \text{ratio}}.$$

Proof If $|r| < 1$ we have that

$$\sum_{k=0}^n a_0 r^k = a_0\,\frac{r^{n+1} - 1}{r - 1}$$

and the limit is $a_0/(1 - r)$. If $|r| \ge 1$ the term $a_0 r^n$ does not converge to 0; hence
the series diverges.
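
The closed form for the partial sums and the value of the limit are easy to check by machine. The sketch below is a minimal illustration; the function name `geometric_partial_sum` and the sample values $a_0 = 3$, $r = 1/2$ are assumptions made here.

```python
def geometric_partial_sum(a0, r, n):
    """Sum of a0 * r**k for k = 0, ..., n, added term by term."""
    total, term = 0.0, a0
    for _ in range(n + 1):
        total += term
        term *= r
    return total

a0, r = 3.0, 0.5
for n in (5, 10, 20, 40):
    closed_form = a0 * (r ** (n + 1) - 1) / (r - 1)
    # Columns: n, term-by-term sum, closed form, limiting value a0 / (1 - r).
    print(n, geometric_partial_sum(a0, r, n), closed_form, a0 / (1 - r))
```

The first two columns agree up to rounding, and both approach the third, namely first term divided by one minus the ratio.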


Proposition 3.18 (The ratio test or D’Alembert’s test) Let $\sum_{k=1}^\infty a_k$ be a positive
series in which no term is 0. Assume that the limit

$$t := \lim_{k\to\infty} \frac{a_{k+1}}{a_k}$$

exists. Then


(1) The series is convergent if t < 1.
(2) The series is divergent if t > 1.
(3) There is no conclusion if t = 1.


Proof (1) Assume that t < 1. Choose a number r such that t < r < 1. There exists
N, such that $a_{k+1}/a_k < r$ for all $k \ge N$. For such k we have that $a_{k+1} < r a_k$, from
which we find $a_k \le a_N r^{k-N}$ for k = N, N + 1, N + 2 and so on (a sound proof can
be made by induction from k = N). But now the series $\sum_{k=N}^\infty a_k$ is convergent as we
see by comparing it to the convergent geometric series $\sum_{k=N}^\infty a_N r^{k-N}$. We restore
the missing terms $a_1, \ldots, a_{N-1}$ and conclude that $\sum_{k=1}^\infty a_k$ is convergent.

(2) Assume next that t > 1. Then there exists N, such that $a_{k+1}/a_k > 1$ for all $k \ge N$,
and so $a_k$ is increasing for $k \ge N$ and cannot have the limit 0.
(3) See below.


If $\lim_{k\to\infty} a_{k+1}/a_k = 1$ no conclusion can be obtained from the ratio test. We
included this claim in the statement of the proposition, to provide some necessary
emphasis, for it is intended to be used as a test and to be referenced as such. Although
item 3 may appear to be an exceptional case, we are forced to come to grips with
it. This will be abundantly clear from the material later in this chapter, and more
especially in the study of power series in Chap. 11.

If the problem is that the limit does not exist, then it is sometimes possible to use a
more delicate version of the ratio test that does not require the limit. Or else it may be
possible to use Cauchy’s root test, which involves calculating the usually rather hard
limit $\lim_{k\to\infty} a_k^{1/k}$. But if $\lim_{k\to\infty} a_{k+1}/a_k = 1$ some other test is needed. There are
many such tests known, thanks to the labours of nineteenth century mathematicians;
for example Raabe’s test or Gauss’s test. These topics will be touched upon in
Chap. 10.

Generally the ratio test is the first thing to try when faced with testing a series for 
convergence. 
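
To get a feeling for the three cases of the ratio test one can tabulate the quotients $a_{k+1}/a_k$ for a few series. The sketch below is illustrative; the three sample series and the helper name `ratio` are choices made here. Exact rational arithmetic is used so that no rounding enters before the final conversion for printing.

```python
from fractions import Fraction

def ratio(a, k):
    """The quotient a_{k+1} / a_k for a term function a."""
    return a(k + 1) / a(k)

series = {
    "k / 2^k": lambda k: Fraction(k, 2 ** k),   # quotient tends to 1/2
    "1 / k": lambda k: Fraction(1, k),          # quotient tends to 1
    "1 / k^2": lambda k: Fraction(1, k ** 2),   # quotient tends to 1
}
for name, a in series.items():
    print(name, [float(ratio(a, k)) for k in (10, 100, 1000)])
```

For the first series the quotients approach 1/2, so the test gives convergence. For the other two the quotients approach 1 and the test is silent, even though one of these series diverges and the other converges.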


3.8.5 Exercises 


1. Test the following series for convergence: 


(oe) 


(a) > n2-" 


n=0 
oe) 


(b) = n'009-n" 


n=0 
foe) 


(c) pe mas 


n=0 


= | 
(d) ae, 




2. Let

$$a_n = \left(1 + \frac{1}{n}\right)^n$$

for n = 1, 2, 3, .... Show that $a_n$ is increasing and that for each n we have


Deduce that the limit


exists and is a finite number.
3. Continuation of the previous exercise. Show that for each m < n we have 


(+2) e144 So ae eater 


Deduce that 


4. Draw conclusions for the following series using the ratio test. The conclusions 
may depend on the number x. You may assume that x is positive. 


(oe) 


(2n + 1)!x” 
(a) a eee 
2 (n!)2 
O on 
(o) > ~ 
n=0 
© (2n + 1)1(3n)!x" 
(c) eS 
2 (n!)° 
) So 2743" x" 
n=0 
(e) a gh x", (a, b, c positive constants). 


om b(b+c)...(b + nc) 


5. Test for convergence the series

$$\sum_{k=0}^\infty \frac{(4k)!}{(k!)^4}\,\frac{26390k + 1103}{396^{4k}}.$$

It is a result of S. Ramanujan that the sum of this series is

$$\frac{9801}{2\sqrt{2}}\,\frac{1}{\pi}.$$

Using a “hand-held” calculator approximate $\pi$ using first one term, then two
terms, of the series. The results are astonishing.


3.8.6 The Series $\sum_{n=1}^\infty 1/n^p$


Let p > 0. The ratio test gives no conclusion for the series $\sum_{n=1}^\infty 1/n^p$ since the
quotient tends to 1. This type of series is sometimes called rather quaintly a p-series.
We are going to estimate the sum directly.

We could straight away suppose that p is rational. We defined rational powers in
Sect. 2.2, without giving the details, and we need the laws of exponents $(y^a)^b = y^{ab}$ and
$y^a y^b = y^{a+b}$. The fact that the quotient tends to 1, that is,

$$\lim_{n\to\infty} \frac{n^p}{(n+1)^p} = 1,$$

follows from Sect. 3.4 Exercise 9.

The conclusions also hold for real p but proving them requires a definition of real
powers, which will be fully covered together with the laws of exponents in Chap. 7.

Consider the terms from $n = 2^k$ to $2^{k+1} - 1$; there are $2^k$ of them and they decrease
with increasing n. Therefore we have

$$\sum_{n=2^k}^{2^{k+1}-1} \frac{1}{n^p} \le \frac{2^k}{(2^k)^p} = \frac{1}{2^{k(p-1)}}.$$

It follows that

$$\sum_{n=1}^{2^{N+1}-1} \frac{1}{n^p} \le \sum_{k=0}^{N} \frac{1}{2^{k(p-1)}}.$$

On the right is a geometric series with ratio $2^{1-p}$. For p > 1 it is convergent and
this implies that the sums on the left-hand side are bounded above independently
of N. Since the terms are positive it follows that the partial sums $\sum_{n=1}^N 1/n^p$ are
also bounded above independently of N. We conclude that the series $\sum_{n=1}^\infty 1/n^p$
converges if p > 1.
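
The block-by-block comparison with a geometric series can be checked numerically. The following sketch is an illustration with choices made here: the exponent p = 2, the function name `block_sum`, and the cut-off N = 10. Each block of terms from $n = 2^k$ to $2^{k+1} - 1$ is compared with the bound $1/2^{k(p-1)}$.

```python
def block_sum(p, k):
    """Sum of 1/n**p over the block n = 2**k, ..., 2**(k+1) - 1."""
    return sum(1 / n ** p for n in range(2 ** k, 2 ** (k + 1)))

p = 2
for k in range(6):
    # Each block sum should not exceed the geometric term 1 / 2**(k (p - 1)).
    print(k, block_sum(p, k), 1 / 2 ** (k * (p - 1)))

N = 10
partial = sum(1 / n ** p for n in range(1, 2 ** (N + 1)))
geometric = sum(1 / 2 ** (k * (p - 1)) for k in range(N + 1))
print(partial, geometric)
```

For p = 2 the geometric sums stay below 2, so the partial sums of the series do too. The comparison is rather generous: the partial sums settle well below that bound.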


We also obtain the estimate 
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Exercise Prove the estimate. 


In particular we find that $\sum_{n=1}^\infty 1/n^2 < 2$. Euler solved the Basel problem in 1735
by showing that

$$\sum_{n=1}^\infty \frac{1}{n^2} = \frac{\pi^2}{6} = 1.6449\ldots$$

Some very crafty proofs of this are known using only fundamental analysis, but it
is best proved using Fourier series, which also yield formulas for $\sum_{n=1}^\infty 1/n^p$ in the
cases when p is an even number.

When p = 1 we have the harmonic series $\sum_{n=1}^\infty 1/n$. The nth term tends to 0,
but we cannot deduce convergence from this. We estimate the terms for $n = 2^k$ to
$2^{k+1} - 1$ from below and obtain

$$\sum_{n=2^k}^{2^{k+1}-1} \frac{1}{n} \ge \frac{2^k}{2^{k+1}} = \frac{1}{2}$$

and so

$$\sum_{n=1}^{2^{N+1}-1} \frac{1}{n} \ge \frac{N+1}{2}.$$

The sums on the left therefore tend to infinity with increasing N and we conclude
that the harmonic series $\sum_{n=1}^\infty 1/n$ is divergent. We even have an estimate of the size
of the partial sums, though it greatly underestimates the rate of growth, which is,
even so, rather small.

In the case 0 < p < 1 the series $\sum_{n=1}^\infty n^{-p}$ diverges by the comparison test, since
$n^{-p} \ge n^{-1}$.
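
How slowly the harmonic series diverges, and how cautious the block estimate is, can be seen from a short computation. In the sketch below the function name `harmonic_partial_sum` is a choice made here. It uses the fact that each block of terms from $n = 2^k$ to $2^{k+1} - 1$ contributes at least 1/2, so that the sum up to $n = 2^{N+1} - 1$ is at least $(N + 1)/2$.

```python
def harmonic_partial_sum(m):
    """Sum of 1/n for n = 1, ..., m."""
    return sum(1 / n for n in range(1, m + 1))

for N in range(0, 16, 3):
    m = 2 ** (N + 1) - 1
    # Columns: N, number of terms m, partial sum, lower bound (N + 1) / 2.
    print(N, m, harmonic_partial_sum(m), (N + 1) / 2)
```

The partial sums do exceed the bound in every row, but even after tens of thousands of terms they are still only of moderate size.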


3.8.7 Telescoping Series 


This method can be used for series that are not necessarily positive. Given the
sequence $(a_n)_{n=1}^\infty$ one can sometimes find another sequence $(b_n)_{n=1}^\infty$ such that

$$a_k = b_k - b_{k+1}, \quad k = 1, 2, 3, \ldots$$

Then we have

$$\sum_{k=1}^n a_k = \sum_{k=1}^n (b_k - b_{k+1}) = b_1 - b_{n+1}.$$

The series $\sum_{k=1}^\infty a_k$ is therefore convergent if and only if the limit $\lim_{n\to\infty} b_n$ exists,
and if so then

$$\sum_{k=1}^\infty a_k = b_1 - \lim_{n\to\infty} b_n.$$
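
A standard instance, chosen here as an illustration, is $b_k = 1/k$, for which $a_k = 1/k - 1/(k+1) = 1/(k(k+1))$. The sketch below checks with exact rational arithmetic that the partial sums collapse to $b_1 - b_{n+1}$; the function names are hypothetical.

```python
from fractions import Fraction

def b(k):
    """Example sequence b_k = 1/k (an assumption for illustration)."""
    return Fraction(1, k)

def a(k):
    """Telescoping term a_k = b_k - b_{k+1}, which equals 1 / (k (k + 1))."""
    return b(k) - b(k + 1)

def partial_sum(n):
    """Sum of a_k for k = 1, ..., n, computed term by term."""
    return sum(a(k) for k in range(1, n + 1))

for n in (1, 10, 100):
    print(n, partial_sum(n), b(1) - b(n + 1))
```

Since $b_n$ tends to 0, the series $\sum_{k=1}^\infty 1/(k(k+1))$ converges and its sum is $b_1 = 1$.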


3.8.8 Exercises (cont’d) 


6. Examine the following series for convergence: 


10. 


11. 


“n(n +1) 
” 2 (n +272 
“n(n +1) 
o 2 (n +2) 
“n(n + 1) 
© Late 


n=1 


3 I 
(d) aoe 


7. Let p be a natural number. Using the telescoping series

$$\sum_{n=1}^\infty \left(\frac{1}{n^p} - \frac{1}{(n+1)^p}\right)$$

as a comparison series give another proof that the series $\sum_{n=1}^\infty 1/n^{p+1}$ converges.


8. Using the method of the previous exercise, but taking $p = \tfrac{1}{2}$, give another proof
that the series $\sum_{n=1}^\infty n^{-3/2}$ converges.


. Examine for convergence the series 


3 (n+ 1)(n4+2) 
n/n ; 


n=1 


10. Find $N$ such that $\sum_{n=1}^{N} 1/n > 100$. It does not have to be the smallest $N$ that
works; that can be found by methods explained in the final chapter.

11. Let


Express in terms of s the sums of the series 
(oe) CO


1 1 ie Os 
ae 2 Gna pa ne - 


n=1 n=1 


The calculation would include a proof that the third series is convergent. 

12. Prove Cauchy’s condensation test. Let $\sum_{n=1}^{\infty} a_n$ be a positive series such that the
terms $a_n$ form a decreasing sequence. Then the series $\sum_{n=1}^{\infty} a_n$ is convergent if
and only if the series $\sum_{n=0}^{\infty} 2^n a_{2^n}$ is convergent.

Hint. The method we used to study the series $\sum_{n=1}^{\infty} n^{-p}$ was essentially Cauchy’s
condensation test.

13. Use Cauchy’s condensation test to study the series $\sum_{n=1}^{\infty} n^{-p} \ln n$.


Note. The function $\ln x$ is the natural logarithm of $x$ and will be properly defined in a later
chapter. Many readers will be familiar with it from school algebra. The only thing you need
to know here is the formula $\ln(2^n) = n \ln 2$.


14. What can be said about the series $\sum_{n=2}^{\infty} n^{-p} (\ln n)^{q}$?
15. Prove the following theorem of Abel. Let the positive series $\sum_{n=1}^{\infty} a_n$ be convergent
and assume that the sequence $a_n$ is decreasing. Then $\lim_{n\to\infty} n a_n = 0$.


Note. This is a necessary condition for convergence that can stand beside Proposition 3.13 and 
settles the harmonic series. 


16. As we have seen, the harmonic series $\sum_{n=1}^{\infty} 1/n$ diverges. Suppose we remove
all terms for which the decimal representation of n includes the digit 9. Show 
that the resulting series converges. 


3.9 Decimals Reprised 


We now give an exact treatment of decimals based on infinite series. We consider a
real number $x$ in the interval $0 \le x \le 1$ and study its decimal representation. By this
is meant a representation as the sum of an infinite series

$$x = \frac{d_1}{10} + \frac{d_2}{10^2} + \frac{d_3}{10^3} + \cdots = \sum_{k=1}^{\infty} \frac{d_k}{10^k} \tag{3.1}$$

where each coefficient $d_k$ is one of the natural numbers 0, 1, 2, 3, 4, 5, 6, 7, 8, 9. The
usual notation for the series is

$$0.d_1 d_2 d_3 \ldots,$$

an expression that we shall call a decimal fraction,² or in short, a decimal. The
coefficients are called decimal digits. Usage varies between countries as to whether
a full-stop, a comma, or a centred dot is used.

We begin by showing that every decimal fraction represents a real number. 


Proposition 3.19 Let $(d_k)_{k=1}^{\infty}$ be a sequence in which each $d_k$ is one of the natural
numbers 0, 1, ..., 9. Then the series $\sum_{k=1}^{\infty} d_k/10^k$ is convergent and its sum is in the
interval [0, 1].


Proof It is enough to point out that

$$0 \le \frac{d_k}{10^k} \le \frac{9}{10^k}$$

and the geometric series $\sum_{k=1}^{\infty} 9/10^k$ is convergent with sum 1.


We have to make an irritating but necessary distinction between the real number
$x$ and the decimal fraction $0.d_1 d_2 d_3 \ldots$ that represents it, since two distinct decimal
fractions can represent the same real number. This follows from the fact, just used,
that

$$\sum_{k=1}^{\infty} \frac{9}{10^k} = 1,$$

or in the usual notation

$$0.999\ldots = 1.$$

This is why we included 1 in the set of real numbers under consideration in Proposition 3.19.
From this it follows that if $d_k < 9$ then the decimals

$$0.d_1 \ldots d_{k-1} d_k 999\ldots \quad \text{and} \quad 0.d_1 \ldots d_{k-1} (d_k + 1) 000\ldots$$

represent the same real number.


Exercise Prove the claims made in the previous paragraph. 


Let us call a decimal, of the kind that appears here on the left, a decimal that 
is eventually 9. We shall see that it is only in such cases that two distinct decimals 
represent the same real number. 

²The terminology here is unconventional. Usually by a decimal fraction is meant a rational number
whose denominator is a power of 10.

We recall the algorithm for determining the decimal digits. For a given number
$x$ in the interval [0, 1[ we define, by induction, sequences $(x_k)_{k=1}^{\infty}$ (the remainders)
and $(d_k)_{k=1}^{\infty}$ (the digits), where $0 \le x_k < 1$ and $d_k$ is one of the natural numbers in
the range 0, ..., 9. Firstly we set $x_1 = x$. When $x_k$ (a real number in the interval
$0 \le x_k < 1$) has been defined, we let $d_k$ be the highest natural number less than or
equal to $10x_k$, and set $x_{k+1} = 10x_k - d_k$. At each step the $k$th remainder determines
the digits $d_j$ for $j \ge k$ and the remainders $x_j$ for $j \ge k + 1$.
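
The decimal algorithm is easy to carry out exactly for rational $x$. The sketch below is one possible rendering in Python, under the assumption that $x$ is supplied as a `Fraction`; the function name `decimal_digits` is our own. It returns the first $n$ digits together with the remainder $x_{n+1}$.

```python
from fractions import Fraction
from math import floor


def decimal_digits(x, n):
    """Return ([d_1, ..., d_n], x_{n+1}) for a rational x with 0 <= x < 1."""
    remainder = Fraction(x)          # this is x_1 = x
    if not 0 <= remainder < 1:
        raise ValueError("x must lie in the interval [0, 1[")
    digits = []
    for _ in range(n):
        d = floor(10 * remainder)    # highest natural number <= 10 x_k
        digits.append(d)
        remainder = 10 * remainder - d   # x_{k+1} = 10 x_k - d_k
    return digits, remainder


digits, remainder = decimal_digits(Fraction(1, 7), 6)
print(digits, remainder)
```

For $x = 1/7$ the remainder after six steps is again $1/7$, which is why the digits then repeat.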


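Proposition 3.20 Let $0 \le x < 1$ and let the sequences $(x_k)_{k=1}^{\infty}$ and $(d_k)_{k=1}^{\infty}$ of
remainders and digits be defined by the decimal algorithm. Then for each natural
number $n$ we have

$$x - \left( \frac{d_1}{10} + \frac{d_2}{10^2} + \cdots + \frac{d_n}{10^n} \right) = \frac{x_{n+1}}{10^n}.$$


Proof We use induction. The result holds for $n = 0$ by the definition of $x_1$ (and
note that the sum within parentheses, being empty, is 0). Suppose that it holds for a
given $n$. Then $x_{n+2} = 10x_{n+1} - d_{n+1}$, so that

$$\frac{x_{n+2}}{10^{n+1}} = \frac{x_{n+1}}{10^n} - \frac{d_{n+1}}{10^{n+1}} = x - \left( \frac{d_1}{10} + \cdots + \frac{d_n}{10^n} + \frac{d_{n+1}}{10^{n+1}} \right).$$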

An obvious consequence is that the decimal algorithm accomplishes what it is 
intended to do. 


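Proposition 3.21 Let $0 \le x < 1$ and let the sequences $(x_k)_{k=1}^{\infty}$ and $(d_k)_{k=1}^{\infty}$ be
defined by the decimal algorithm. Then

$$x = 0.d_1 d_2 d_3 \ldots = \sum_{k=1}^{\infty} \frac{d_k}{10^k}.$$


Proof Since $0 \le x_n < 1$ for all $n$, it is clear by Proposition 3.20 that

$$0 \le x - \sum_{k=1}^{n} \frac{d_k}{10^k} < \frac{1}{10^n}$$

and the conclusion follows since $\lim_{n\to\infty} 10^{-n} = 0$.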


We conclude that every real number in the interval [0, 1[ can be represented as a
decimal fraction. We even have the error estimate that using $n$ digits gives an error
less than $10^{-n}$. Moreover every real number in the interval [0, 1[ can be represented
by a decimal that is not eventually 9. This is because a decimal that is eventually 9 
can be replaced by one that is not eventually 9, and represents the same real number, 
as we have seen. 

We come to the main conclusion regarding decimal fractions. 


Proposition 3.22 The decimal algorithm sets up a one-to-one correspondence
between real numbers in the interval $0 \le x < 1$ and decimal fractions that are not
eventually 9.

Proof Consider a decimal $0.d_1 d_2 d_3 \ldots$ that is not eventually 9, and suppose it represents
the real number $x$. Then $0 \le x < 1$. For we know that

$$\frac{9}{10} + \frac{9}{10^2} + \frac{9}{10^3} + \cdots = 1$$

and at least one of the coefficients $d_k$ is not 9. Hence we have

$$\frac{d_1}{10} + \frac{d_2}{10^2} + \frac{d_3}{10^3} + \cdots < 1.$$

Next consider a real number $x$ in the interval $0 \le x < 1$. We saw that $x$ can be
represented by a decimal that is not eventually 9. Let

$$x = \frac{d_1}{10} + \frac{d_2}{10^2} + \frac{d_3}{10^3} + \cdots$$

be such a representation. Now we can show that the decimal algorithm, applied to
$x$, produces for the $k$th digit the displayed coefficient $d_k$, and the $k$th remainder is

$$x_k = \frac{d_k}{10} + \frac{d_{k+1}}{10^2} + \frac{d_{k+2}}{10^3} + \cdots \tag{3.2}$$

The proof of this claim is by induction. By definition

$$x_1 = x = \frac{d_1}{10} + \frac{d_2}{10^2} + \frac{d_3}{10^3} + \cdots,$$

so (3.2) holds for the case $k = 1$. Suppose that (3.2) is known to hold for a given $k$.
Then

$$10 x_k = d_k + \frac{d_{k+1}}{10} + \frac{d_{k+2}}{10^2} + \cdots$$

and the highest natural number less than or equal to this is $d_k$; it cannot be $d_k + 1$
since at least one of the succeeding digits is not 9, which implies

$$\frac{d_{k+1}}{10} + \frac{d_{k+2}}{10^2} + \frac{d_{k+3}}{10^3} + \cdots < 1.$$

Hence

$$x_{k+1} = \frac{d_{k+1}}{10} + \frac{d_{k+2}}{10^2} + \frac{d_{k+3}}{10^3} + \cdots$$

and the next digit is $d_{k+1}$.
These arguments show that the decimal algorithm produces the unique representation
of each $x$ in the interval [0, 1[ as a decimal that is not eventually 9.
3.9.1 Exercises


1. The duodecimal system uses the base 12. There are various suggestions for writing 
the digits denoting 10 and 11. We shall simply use A and B. In the rest of this 
exercise we use the duodecimal system and express numbers using the base 12 
(written “10” in the duodecimal system). Now express the following fractions as 
duodecimal expansions: 


1 1 1 1 A 
3° ye A’ B’ B’ 


2. The study of decimals has close links to number theory and algebra. This is 
apparent in the decimal expansion of 1/p where p is a prime. You might like to 
compute the decimal expansion of 1/p for prime p, up to, say, p = 31. You will
see that, excepting the cases p = 2 and p = 5 (the prime divisors of 10), there is 
no initial string, and the period length divides p — 1. 


You can check the following if you are patient. For 1/7 the period is 6, the max- 
imum possible. The maximum period occurs again for the prime denominators 
17, 19, 23, 29, 47, 59, 61, 97 (these are the only ones under 100 for which the 
period is the maximum possible). 


Note. The explanation, which requires some very basic group theory, is briefly as follows. The
sequence $(x_n)_{n=1}^{\infty}$ generated by the decimal algorithm is given by $x_n = a_n/p$, where $a_n$ is the
remainder obtained on dividing $p$ into $10^{n-1}$. The numbers $a_n$ are certain elements of the set
$\{1, 2, \ldots, p - 1\}$. It is known from number theory that this set forms a cyclic group $G$ (usually
denoted by $(\mathbb{Z}/p\mathbb{Z})^{\times}$) of order $p - 1$, under the operation of multiplication modulo $p$. The
elements $a_n$ are the remainders of the successive powers of 10; they form a subgroup of $G$,
the one generated by 10 (or 10 reduced modulo $p$; that only makes a difference for $p < 10$).
Starting at $a_1 = 1$, no repetition can occur until we reach 1 again. There is therefore no initial
string. The order of the subgroup of $G$ generated by 10 is therefore the length of the period in
the decimal expansion of $1/p$, and, by Lagrange’s theorem of group theory, it divides the order
of $G$, that is, it divides $p - 1$. If 10 actually generates $G$ we get a period of length $p - 1$, the
maximum possible. It is an unsolved problem whether or not this happens for infinitely many
primes (Artin’s conjecture in number theory would imply that it does).
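
The claims about period lengths can be checked by machine rather than by patience. The following Python sketch follows the remainders of the successive powers of 10 modulo $p$ until the value 1 recurs; the helper names `period_of_reciprocal` and `is_prime` are our own, and the code assumes that $p$ is a prime other than 2 and 5.

```python
def is_prime(n):
    """Trial division; adequate for the small numbers used here."""
    if n < 2:
        return False
    return all(n % d != 0 for d in range(2, int(n ** 0.5) + 1))


def period_of_reciprocal(p):
    """Length of the period of the decimal expansion of 1/p (p prime, not 2 or 5)."""
    if p in (2, 5) or not is_prime(p):
        raise ValueError("p must be a prime other than 2 and 5")
    a = 10 % p          # remainder of 10^1 modulo p
    length = 1
    while a != 1:       # stop when the remainders return to a_1 = 1
        a = (10 * a) % p
        length += 1
    return length


primes = [p for p in range(3, 100) if is_prime(p) and p != 5]
full_period = [p for p in primes if period_of_reciprocal(p) == p - 1]
print(full_period)
```

The list printed at the end contains the primes under 100 for which the period of $1/p$ has the maximum length $p - 1$.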


3.10 (◊) Philosophical Implications of Decimals


Everyone knows the practical importance of decimals for doing calculations with 
real numbers. In this nugget we will consider an importance of a quite different kind. 

We have not constructed the set of real numbers; instead we posited the existence
of a set with certain properties. The indefinite article is important. In older treatises on
analysis, it was common to construct such a set and prove that it satisfied the axioms
A, B and C. The axioms are then theorems, as we might say. To construct such a
set some raw materials are needed. These are commonly the natural numbers, and a 
good dose of set theory allowing the building of some sets. One quickly builds the 
rational numbers, as the set of fractions m/n, with the usual caveat about cancellation 
of factors. So let us assume that we already have a set representing Q. We will look 
at a couple of constructions of R from Q that are historically important. It will be 
seen that quite a lot of set theory is needed to build the required sets, but we will not 
explain it in any detail. 


Dedekind’s construction. We have seen the notion of Dedekind section. Now we
can apply the same notion to Q. A Dedekind section of Q is a partition of Q into
two subsets, $D_l$ and $D_r$, such that neither is empty, every rational belongs to $D_l$ or
$D_r$ but not to both, and for all rationals $x$ and $y$, if $x \in D_l$ and $y \in D_r$ then $x < y$.
It is common to add the requirement, and we do so, that $D_l$ has no highest member.

According to Dedekind a real number is a Dedekind section of the rationals. The
rationals are viewed as particular reals by embedding the set of rationals into the set
of reals as follows. The rational $q$ is identified with the Dedekind section $\{D_l, D_r\}$
for which $D_l = \{s \in Q : s < q\}$.

This leaves the daunting task of defining algebraic operations with, and ordering
of, the Dedekind real numbers and proving the axioms A, B and C as theorems.
Actually it turns out that ordering and the completeness axiom are really easy. If we
have two Dedekind real numbers $x$ and $x'$, being the sections $\{D_l, D_r\}$ and $\{D_l', D_r'\}$,
then $x \le x'$ shall mean $D_l \subseteq D_l'$. The supremum of a set $A$ of Dedekind real numbers
that is bounded above always exists. It is the Dedekind section whose left set is the
union of the left sets of the members of $A$.


Cantor’s construction. Cantor had a completely different view of the real numbers. 
He saw sequences of rationals as the key to defining them. We have seen that every 
real number is the limit of a sequence of rationals. So we can think of a real number as 
a sequence of rationals, that either converges to a rational, or else wants to converge 
but has no rational to converge to. 

This can be made precise. We single out those sequences of rationals that satisfy
Cauchy’s condition, in a form that mentions only rationals. Thus a sequence $(a_n)_{n=1}^{\infty}$
of rationals can be called a Cauchy sequence if it satisfies the following condition.
For every rational $\varepsilon > 0$ there exists a natural number $N$, such that $|a_m - a_n| < \varepsilon$
for all $m \ge N$ and $n \ge N$.

Now we could say: a real number is a Cauchy sequence of rationals. However
there is a problem. Different sequences of rationals could have the same limit when
that limit is rational, which for starters makes it impossible to embed the rationals
into the reals. Which Cauchy sequence of rationals are we to identify $\tfrac{1}{2}$ with? This is
overcome by bunching Cauchy sequences of rationals, that we think should converge
to the same limit, into sets, so-called equivalence classes. Two Cauchy sequences
$(a_k)_{k=1}^{\infty}$ and $(b_k)_{k=1}^{\infty}$ belong to the same equivalence class if $\lim_{k\to\infty} (a_k - b_k) = 0$ (the
limit being interpreted in a way that mentions only rationals).




According to Cantor’s point of view, a real number is an equivalence class of 
Cauchy sequences of rationals. 

Dedekind and Cantor might have had a most interesting argument over who had 
the nicer version of real numbers. But we can also imagine that they meet and have the 
following conversation (in German presumably, but a translation has most helpfully 
been provided). Cantor says “I’m thinking of a real number” (that is, an equivalence
class of Cauchy sequences of rationals) “and it lies between the natural numbers 0
and 1”. Dedekind says “I too am thinking of a real number” (that is, a Dedekind
section of the rationals) “and it too lies between 0 and 1”. Cantor asks “What are the
decimal digits of your number?” Dedekind replies “They are 0, 1, 0, 2, 0, 3, 0, 4, and
so on”. “Interesting” says Cantor, “mine has the same digits. We are thinking of the
same number”.

We now see the force of Proposition 3.21. All versions of the real numbers are 
really the same, though they may look very different. We can identify a real number 
in Alice’s version with a real number in Bill’s version if they have the same decimal 
digits. This is a big comfort for we want the real numbers to be in some sense unique. 
In contrast, a field (a set with binary operations that satisfy axioms A) is not in any 
sense unique. There exists a field with 2 elements and another with 4. They are clearly 
in no way the same. 

Only one thing can spoil this beautiful uniqueness of the real numbers. As decimal 
digits are nothing but a sequence of natural numbers in the range 0 to 9, Alice and 
Bill have to agree about what a sequence is. More precisely, although they may agree 
that a sequence is an assignment of terms to the natural numbers, they may disagree 
as to what constitutes an admissible assignment. For example it is possible that Bill 
requires the terms of a sequence to be in some sense computable for the assignment to 
be admissible. Alice may say “For each $n$ the $n$th digit is 0 if the twin prime conjecture
is true and it is 1 if it is untrue”. Alice’s number is either 0 or $\tfrac{1}{9}$. Presumably Alice
does not know which it is (if she did she might qualify for a Fields Medal as the twin 
prime conjecture is unsolved at the time of writing), but has, according to standard 
thinking about sets, successfully defined a sequence of digits, and a very simple one 
at that, being entirely constant. Over this Bill may disagree. 

Logicians say that in each model of set theory there is a unique model of the real 
numbers. This is just a way of saying that if Alice and Bill agree over how to assign 
terms in a sequence, they will have essentially the same real numbers. This is the 
philosophical importance of decimals. 


3.10.1 Pointers to Further Study 


— Mathematical logic 
— Models of the real numbers 
— Axiomatic set theory 
3.11 (◊) Limit Inferior and Limit Superior


A sequence $(a_n)_{n=1}^{\infty}$ that is bounded is not necessarily convergent. Logically it is not
correct to write the expression “$\lim_{n\to\infty} a_n$” without having first shown that the limit
exists (though we often do so without coming to harm). In this uncomfortable situation
we can use $\limsup_{n\to\infty} a_n$ (limit superior) and $\liminf_{n\to\infty} a_n$ (limit inferior),
quantities that always exist if the sequence is bounded but not necessarily convergent.
If we allow the values $\infty$ and $-\infty$ then they exist for all sequences whether bounded
or not.

Limit inferior and limit superior are often used to prove that a limit exists and to 
calculate it. On the other hand most, if not all, calculations that use these notions can 
be carried out without them, and are not thereby appreciably longer. Use of these 
operations is very much a matter of personal preference, and one can usually get 
on quite well without them. However, it is right to mention that limit inferior and 
limit superior do appear in certain important formulas (such as that for the radius of 
convergence of a power series), and certain theorems (such as in Fatou’s lemma of 
integration theory). 

Let $(a_n)_{n=1}^{\infty}$ be a bounded sequence. We define the limit superior of the sequence
$(a_n)_{n=1}^{\infty}$ by

$$\limsup_{n\to\infty} a_n := \lim_{n\to\infty} \Bigl( \sup_{k \ge n} a_k \Bigr).$$

To explain this better we let

$$h_n := \sup_{k \ge n} a_k = \sup\{a_n, a_{n+1}, a_{n+2}, \ldots\}.$$

As the sequence $(a_n)_{n=1}^{\infty}$ is bounded above, the number $h_n$ is certainly finite. In fact
we have $h_n \le \sup_{k \ge 1} a_k$. As the sequence $(a_n)_{n=1}^{\infty}$ is bounded below, the sequence
$(h_n)_{n=1}^{\infty}$ is bounded below; in fact $h_n \ge \inf_{k \ge 1} a_k$. Moreover $h_n$ is decreasing, being
the supremum of a set that shrinks with increasing $n$. The limit $\lim_{n\to\infty} h_n$ therefore
exists and is a finite number.

In a similar way we define the limit inferior by

$$\liminf_{n\to\infty} a_n := \lim_{n\to\infty} \Bigl( \inf_{k \ge n} a_k \Bigr).$$


Now it is easy to obtain the following rules. For the moment all the sequences are
supposed to be bounded.

(i) $\inf_{n \ge 1} a_n \le \liminf_{n\to\infty} a_n \le \limsup_{n\to\infty} a_n \le \sup_{n \ge 1} a_n$.

(ii) A sequence $(a_n)_{n=1}^{\infty}$ is convergent if and only if

$$\liminf_{n\to\infty} a_n = \limsup_{n\to\infty} a_n.$$

Given this equality the common value is $\lim_{n\to\infty} a_n$.

(iii) If $a_n \le b_n \le c_n$ we have

$$\limsup_{n\to\infty} a_n \le \limsup_{n\to\infty} b_n \le \limsup_{n\to\infty} c_n$$

and

$$\liminf_{n\to\infty} a_n \le \liminf_{n\to\infty} b_n \le \liminf_{n\to\infty} c_n.$$


Exercise Prove these rules.


We can also allow the values $\infty$ and $-\infty$. If $a_n$ is not bounded above we set
$\limsup_{n\to\infty} a_n = \infty$. If $a_n$ is bounded above, but $\sup_{k \ge n} a_k$ tends to $-\infty$, we set
$\limsup_{n\to\infty} a_n = -\infty$. Similarly we can assign infinite values to $\liminf_{n\to\infty} a_n$. By
this means we can define limit inferior and limit superior for arbitrary sequences.


Examples 


(a) (1, -1, 1, -1, 1, -1,...). 


Limit inferior is —1, limit superior 1. 


4 3 6 
(b) (3, 2 > i, 2.3 4? 25 5? 2 6° 2, A ysee)s 


Limit inferior is 1, limit superior 2. 


1 1 2 2 3 3 4 4 5 3 
(CS) (gy gn ge ar Ge ae Se ee Ri eee 


Limit inferior is —1, limit superior 1. 
(d) (1, 3, 2, 4, 3, 5, 4, 6, 5, 7, ...).

Limit superior and limit inferior are both $\infty$, which is also the limit.
(e) $a_n = \sin n$, $n = 1, 2, 3, \ldots$.

Limit inferior is −1, limit superior 1.
Exercise Check the above claims. For example (e) you will need to know that $\sin x$
is continuous, periodic, has maximum value 1, minimum −1, and its period $2\pi$
is irrational. So you might like to wait until these concepts have been properly


treated in later chapters. You might also find useful Sect. 2.5 Exercise 3 and Sect. 3.5 
Exercise 3. 
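
A computer cannot take a supremum over an infinite tail, but it can suggest what the answer should be. The Python sketch below replaces $\sup_{k \ge n} a_k$ and $\inf_{k \ge n} a_k$ by the maximum and minimum over a finite window $n \le k < n + \text{horizon}$; this truncation is our own assumption, as are the names `tail_sup` and `tail_inf`, and the output is evidence rather than proof.

```python
import math


def tail_sup(a, n, horizon):
    """Approximate sup{a_k : k >= n} by the maximum over n <= k < n + horizon."""
    return max(a(k) for k in range(n, n + horizon))


def tail_inf(a, n, horizon):
    """Approximate inf{a_k : k >= n} by the minimum over n <= k < n + horizon."""
    return min(a(k) for k in range(n, n + horizon))


def alternating(k):
    """Example (a): the sequence 1, -1, 1, -1, ..."""
    return (-1) ** (k + 1)


def sine(k):
    """Example (e): a_k = sin k."""
    return math.sin(k)


for name, a in (("alternating", alternating), ("sin n", sine)):
    print(name, tail_inf(a, 1000, 5000), tail_sup(a, 1000, 5000))
```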


We can show limit superior in action by proving the following result (though 
actually a proof avoiding it is not longer). 


Proposition 3.23 Let $(a_n)_{n=1}^{\infty}$ be a real sequence such that $\lim_{n\to\infty} a_n = t$. Let
$\sigma_n = \bigl( \sum_{k=1}^{n} a_k \bigr) / n$. Then $\lim_{n\to\infty} \sigma_n = t$. The conclusion also holds if $t = \infty$ or $t = -\infty$.

Proof We write the proof for the case when $t$ is a finite number, leaving the cases
$t = \pm\infty$ as an exercise. We first write

$$\sigma_n - t = \frac{1}{n} \sum_{k=1}^{n} (a_k - t).$$

Let $\varepsilon > 0$. Choose $N$, such that $|a_n - t| < \varepsilon$ for all $n \ge N$. For a given $n \ge N$ we
split up the sum into terms with $k \le N - 1$ and terms with $k \ge N$. We find

$$|\sigma_n - t| \le \frac{1}{n} \sum_{k=1}^{N-1} |a_k - t| + \frac{n - N + 1}{n}\, \varepsilon. \tag{3.3}$$

This holds for all $n \ge N$. Let $n \to \infty$ (but hold $N$ and $\varepsilon$ fixed). The right-hand side
has the limit $\varepsilon$ and we conclude (without knowing whether the left-hand side has a
limit)

$$\limsup_{n\to\infty} |\sigma_n - t| \le \varepsilon.$$

Now this must hold for all $\varepsilon > 0$ so in fact $\limsup_{n\to\infty} |\sigma_n - t|$ must be 0, that is,
$\lim_{n\to\infty} |\sigma_n - t|$ exists and is 0. This gives $\lim_{n\to\infty} \sigma_n = t$.
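
A numerical experiment shows the averages following the sequence to its limit. The sketch below is in Python with floating-point arithmetic; the function name `cesaro_means` and the sample sequence $a_n = 1 + 1/n$ are our own choices for illustration.

```python
def cesaro_means(a, n_max):
    """Return [sigma_1, ..., sigma_{n_max}], where sigma_n = (a(1) + ... + a(n)) / n."""
    means = []
    running_total = 0.0
    for n in range(1, n_max + 1):
        running_total += a(n)
        means.append(running_total / n)
    return means


def sample(n):
    """An assumed example sequence with limit 1."""
    return 1 + 1 / n


means = cesaro_means(sample, 10000)
for n in (10, 100, 1000, 10000):
    print(n, sample(n), means[n - 1])
```

In this example the averages approach 1 more slowly than the sequence itself, since the early terms continue to weigh on the mean.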


3.11.1 Exercises 


1. Prove the cases $t = \pm\infty$ of Proposition 3.23.
2. Finish the proof of Proposition 3.23 from Eq. (3.3) without using limit superior
(it shouldn’t be longer).

3. Let $(a_n)_{n=1}^{\infty}$ and $(b_n)_{n=1}^{\infty}$ be convergent sequences and let their limits be $s$ and
$t$ respectively. Let the sequence $(c_n)_{n=1}^{\infty}$ be defined by $c_n = \frac{1}{n} \sum_{k=1}^{n} a_k b_{n-k+1}$.
Prove that $\lim_{n\to\infty} c_n = st$.

4. Prove the following generalisation of Proposition 3.23. Suppose that all terms of
the sequence $(c_n)_{n=1}^{\infty}$ are positive and that the series $\sum_{n=1}^{\infty} c_n$ diverges. Let $(a_n)_{n=1}^{\infty}$
be a sequence with limit $t$ (may be $\pm\infty$). Define

$$\sigma_n = \frac{\sum_{k=1}^{n} c_k a_k}{\sum_{k=1}^{n} c_k}.$$

Show that $\lim_{n\to\infty} \sigma_n = t$.
3.11.2 Uses of Limit Inferior and Limit Superior


The dichotomy of finite versus infinite often lurks behind the appearance of limit
inferior or limit superior. Given a sequence $(a_n)_{n=1}^{\infty}$, and a predicate $P(x)$ applicable
to real numbers, we shall say that $P(a_n)$ is eventually true if there exists $N$, such
that $P(a_n)$ is true for all $n \ge N$. This is the same as saying that $P(a_n)$ is false for, at
most, finitely many place numbers $n$. We shall say that $P(a_n)$ is infinitely often true
if $P(a_n)$ holds for infinitely many place numbers $n$.

Now let $(a_n)_{n=1}^{\infty}$ be a bounded sequence. The reader is invited to prove the following
characterisations of limit inferior and limit superior:


(i) $\liminf_{n\to\infty} a_n = t$ if and only if the following condition holds: for each $\varepsilon > 0$
the inequality $a_n > t - \varepsilon$ is eventually true and the inequality $a_n < t + \varepsilon$ is
infinitely often true.

(ii) $\limsup_{n\to\infty} a_n = t$ if and only if the following condition holds: for each $\varepsilon > 0$
the inequality $a_n < t + \varepsilon$ is eventually true and the inequality $a_n > t - \varepsilon$ is
infinitely often true.


Certain frequently cited properties that a sequence may possess can be expressed 
succinctly with limit superior or limit inferior. In the following table we exhibit four 
such properties opposite their equivalent, and less wordy, formulations using limit 
superior. The abbreviation ‘i.o.’ stands for ‘infinitely often’. 


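(a) $\limsup_{n\to\infty} a_n < t$      There exists $t' < t$, such that $a_n < t'$ eventually.
(b) $\limsup_{n\to\infty} a_n \le t$    For all $\varepsilon > 0$, $a_n < t + \varepsilon$ eventually.
(c) $\limsup_{n\to\infty} a_n > t$      There exists $t' > t$, such that $a_n > t'$ i.o.
(d) $\limsup_{n\to\infty} a_n \ge t$    For all $\varepsilon > 0$, $a_n > t - \varepsilon$ i.o.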


The reader is invited to prove these claims, and formulate similar ones using limit 
inferior. 

The ratio test can be generalised using limit inferior and limit superior, in a form
that does not require that $a_{n+1}/a_n$ converges.


Proposition 3.24 Let $\sum_{n=1}^{\infty} a_n$ be a positive series in which no term is 0. The following
conclusions hold:

(1) If $\limsup_{n\to\infty} \dfrac{a_{n+1}}{a_n} < 1$ the series is convergent.

(2) If $\liminf_{n\to\infty} \dfrac{a_{n+1}}{a_n} > 1$ the series is divergent.

Proof In case 1 there exists $t < 1$, such that $a_{n+1}/a_n < t$ eventually. This implies
(as the reader should check) that there exist $n_0$ and $C$, such that $a_n \le C t^n$ for all
$n \ge n_0$. We obtain convergence of the series $\sum_{n=1}^{\infty} a_n$ by comparison with the series
$\sum_{n=1}^{\infty} t^n$.
In case 2 there exists $t > 1$, such that $a_{n+1}/a_n > t$ eventually. This implies divergence
since $a_n$ cannot tend to 0.
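
To see that the generalised test covers cases the ordinary ratio test misses, consider the series with terms $a_n = (2 + (-1)^n)/4^n$. This example is our own, not taken from the text. The ratios $a_{n+1}/a_n$ alternate between two values and so do not converge, yet their limit superior is less than 1. A Python sketch with exact arithmetic:

```python
from fractions import Fraction


def a(n):
    """Terms of an assumed example series: a_n = (2 + (-1)^n) / 4^n."""
    return Fraction(2 + (-1) ** n, 4 ** n)


# The ratios take only two values, so lim sup and lim inf can be read off directly.
ratios = [a(n + 1) / a(n) for n in range(1, 41)]
print(sorted(set(ratios)))

# Partial sums stay below the exact sum 2/3 - 1/5 = 7/15 of the two geometric series.
partial = sum(a(n) for n in range(1, 41))
print(float(partial), float(Fraction(7, 15)))
```

Here the limit superior of the ratios is 3/4, so case 1 of the proposition applies.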
One sometimes encounters limits of the form $\lim_{n\to\infty} a_n^{1/n}$ and their treatment can
be puzzling. The following result is often useful. It is another case where limit inferior
or limit superior can be used optionally in the proof. We will need the known limit,
for a given positive constant $b$, that $\lim_{n\to\infty} b^{1/n} = 1$. About the $n$th root function
$x^{1/n}$: we will show in detail later that every positive real number has a unique positive
$n$th root. Moreover the $n$th root function is increasing.


Proposition 3.25 Let $(a_n)_{n=1}^{\infty}$ be a real sequence such that $a_n > 0$ for all $n$. Assume
that $\lim_{n\to\infty} a_n / a_{n-1} = t$. Then $\lim_{n\to\infty} a_n^{1/n} = t$. The conclusion also holds if
$t = \infty$.


Proof Obviously $t \ge 0$. We shall write out the proof in the case that $t$ is a finite,
positive number. The cases $t = 0$ and $t = \infty$ are left to the exercises.
Let $\varepsilon > 0$. Reduce $\varepsilon$, if necessary, so that $\varepsilon < t$ (it is here that we want $t > 0$).
There exists $N$, such that

$$t - \varepsilon < \frac{a_n}{a_{n-1}} < t + \varepsilon$$

for all $n \ge N$. Then, for all $n \ge N$, we have

$$(t - \varepsilon)\, a_{n-1} < a_n < (t + \varepsilon)\, a_{n-1}$$

and by induction we find

$$(t - \varepsilon)^{n-N} a_N < a_n < (t + \varepsilon)^{n-N} a_N$$

for all $n \ge N + 1$. Taking the $n$th root, and using the fact that the $n$th root function is
increasing, we find

$$(t - \varepsilon)^{1 - \frac{N}{n}}\, a_N^{\frac{1}{n}} < a_n^{\frac{1}{n}} < (t + \varepsilon)^{1 - \frac{N}{n}}\, a_N^{\frac{1}{n}}$$

for all $n \ge N + 1$. Let $n \to \infty$. Now

$$(t - \varepsilon)^{1 - \frac{N}{n}} \to t - \varepsilon, \qquad (t + \varepsilon)^{1 - \frac{N}{n}} \to t + \varepsilon, \qquad a_N^{\frac{1}{n}} \to 1$$

(all three follow from the limit $\lim_{n\to\infty} b^{1/n} = 1$). We conclude that

$$t - \varepsilon \le \liminf_{n\to\infty} a_n^{\frac{1}{n}} \le \limsup_{n\to\infty} a_n^{\frac{1}{n}} \le t + \varepsilon.$$

This holds for all $\varepsilon > 0$. We conclude that $\liminf_{n\to\infty} a_n^{1/n}$ and $\limsup_{n\to\infty} a_n^{1/n}$ are
equal to $t$.
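
The proposition can be observed numerically. In the Python sketch below the sequence $a_n = n \cdot 2^n$ is our own example; its ratios tend to 2, so the $n$th roots should do the same. The roots are computed through logarithms, because $a_n$ itself is far too large for floating-point arithmetic when $n$ is large.

```python
import math


def a(n):
    """An assumed example sequence: a_n = n * 2^n, kept as an exact integer."""
    return n * 2 ** n


def ratio(n):
    """The ratio a_n / a_{n-1}."""
    return a(n) / a(n - 1)


def nth_root(n):
    """The n-th root of a_n, computed as exp(log(a_n) / n) to avoid overflow."""
    return math.exp(math.log(a(n)) / n)


for n in (10, 100, 1000, 2000):
    print(n, ratio(n), nth_root(n))
```

Both columns approach 2, the roots rather more slowly than the ratios.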
3.11.3 Exercises (cont’d)


5. What minor modification is needed in the proof of Proposition 3.25 for the case
$t = 0$?

6. Finish the proof of Proposition 3.25 by considering the case $t = \infty$.

7. Let $\limsup_{n\to\infty} a_n = t$. Show that there exists a subsequence $(a_{k_n})_{n=1}^{\infty}$, such
that $\lim_{n\to\infty} a_{k_n} = t$. Here we may have $t = \infty$. A similar result holds for
$\liminf_{n\to\infty} a_n$.

8. Give another (actually the third of this text) proof of the Bolzano–Weierstrass
theorem (Proposition 3.10) using the previous exercise.

9. Give another proof that a sequence $(a_n)_{n=1}^{\infty}$ that satisfies Cauchy’s condition
(Sect. 3.7) is convergent, by showing that $\liminf_{n\to\infty} a_n = \limsup_{n\to\infty} a_n$.


3.11.4 Pointers to Further Study 


— Semi-continuous functions 
— Radius of convergence 
— Fatou’s lemma 


3.12 (◊) Continued Fractions


Decimals provide the best known, and perhaps the most practical, way to approximate 
a real number by rational numbers. But they come at a price. Consider the following 
example: 

x = 0.797997999799997999997.... 


The sequence of digits consists of isolated instances of the digit “7” interspersed
with lengthening strings of the digit “9”. This ensures that the number is irrational.
If we truncate at the $n$th digit we obtain a rational approximation of the form $a/10^n$,
and an error between $7/10^{n+1}$ and $1/10^n$.

The price of this error is the size of the denominator. To ensure an error less than
$1/10^n$ we have to use a fraction with denominator $10^n$. This is an expensive error,
but it is possible to do much better. Approximations are possible with fractions $a/b$
for which the error is less than $1/b^2$. They can be obtained through the continued
fraction algorithm.

Whereas a decimal representation of a number x uses a sequence of digits from 
the range 0, 1, ..., 9, a continued fraction uses a sequence of integers, which can be 
arbitrarily large. This sequence is finite if and only if x is rational, unlike a decimal 
expansion, which can have infinitely many non-zero digits whilst representing a 
rational. The integers of the continued fraction are generated from the number x by 
a simple algorithm, which we describe in the next paragraph. 
Let $x$ be a real number. Using the notation $[x]$ for the highest integer less than or
equal to $x$, we set

$$a_0 = [x], \qquad x_1 = x - a_0.$$

If $x_1 \ne 0$ we set

$$a_1 = \left[ \frac{1}{x_1} \right], \qquad x_2 = \frac{1}{x_1} - a_1.$$

We continue inductively. Having reached $a_{n-1}$ and $x_n$, and assuming that $x_n \ne 0$, we
set

$$a_n = \left[ \frac{1}{x_n} \right], \qquad x_{n+1} = \frac{1}{x_n} - a_n.$$

The process terminates if, for some $n$, we have $x_n = 0$. Since $x_n$ lies in the interval
$0 \le x_n < 1$ for every $n \ge 1$, it is clear that all the integers $a_n$, except possibly for $a_0$,
are positive.
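
For rational $x$ the algorithm can be run exactly. The following Python sketch assumes that $x$ is given as a `Fraction`; the function name `continued_fraction` and the safety cap `max_terms` are our own additions and are not part of the algorithm as described.

```python
from fractions import Fraction
from math import floor


def continued_fraction(x, max_terms=50):
    """Return the entries [a_0, a_1, ...] produced by the algorithm for a rational x."""
    x = Fraction(x)
    a = floor(x)                 # a_0 = [x]
    entries = [a]
    remainder = x - a            # x_1 = x - a_0
    while remainder != 0 and len(entries) < max_terms:
        reciprocal = 1 / remainder
        a = floor(reciprocal)    # a_n = [1 / x_n]
        entries.append(a)
        remainder = reciprocal - a   # x_{n+1} = 1 / x_n - a_n
    return entries


print(continued_fraction(Fraction(415, 93)))
print(continued_fraction(Fraction(-7, 3)))
```

The second call illustrates that only the first entry can be negative.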

If the sequence terminates with $a_n$ (because $x_{n+1} = 0$) then $x$ must be rational. In
fact, unravelling the reciprocals we find that

$$x = a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cfrac{1}{\ddots + \cfrac{1}{a_n}}}}.$$

This expression is known as a continued fraction. Their study is very old,
with hints of them in ancient mathematics; for example they are closely related to
the Euclidean algorithm.

There is a short notation for a continued fraction. We denote the above expression,
whether or not the entries are integers, by

$$[a_0, a_1, a_2, \ldots, a_n].$$

If $x$ is irrational then the sequence of integers cannot terminate. We would then
like to write

$$x = [a_0, a_1, a_2, \ldots].$$

The right-hand side can be interpreted as the limit

$$\lim_{n\to\infty} [a_0, a_1, a_2, \ldots, a_n]$$

if the limit exists. In fact the limit does exist, and it really does equal $x$. The proof is
a bit lengthy and substantial parts of it will be left to the exercises.
We begin by writing

$$\frac{p_n}{q_n} = [a_0, a_1, a_2, \ldots, a_n]$$

where $p_n$ and $q_n$ are coprime integers (that is, integers with highest common divisor 1).
The fraction $[a_0, a_1, a_2, \ldots, a_n]$ is called a convergent (anticipating the result;
but it is convenient already to have a name for it).

There is a simple way to calculate the sequences $(p_k)_{k=0}^{n}$ and $(q_k)_{k=0}^{n}$ from the
sequence $(a_k)_{k=0}^{n}$ without having to pick one’s way through a pile of nested reciprocals.
Both sequences satisfy the same recurrence relations, namely

$$p_k = a_k p_{k-1} + p_{k-2}, \qquad q_k = a_k q_{k-1} + q_{k-2}, \qquad k = 2, 3, \ldots \tag{3.4}$$

with the initial values $p_0 = a_0$, $p_1 = a_0 a_1 + 1$, $q_0 = 1$, $q_1 = a_1$.
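
The recurrence relations translate directly into a short program. The Python sketch below computes the pairs $(p_k, q_k)$ from the entries and, for comparison, evaluates the nested reciprocals directly; the function names `convergents` and `evaluate` are our own, and the entries are assumed to be integers with all but the first positive.

```python
from fractions import Fraction


def convergents(entries):
    """Return [(p_0, q_0), ..., (p_n, q_n)] for [a_0, ..., a_n] using the recurrences (3.4)."""
    a = entries
    p = [a[0]]
    q = [1]
    if len(a) > 1:
        p.append(a[0] * a[1] + 1)
        q.append(a[1])
    for k in range(2, len(a)):
        p.append(a[k] * p[k - 1] + p[k - 2])
        q.append(a[k] * q[k - 1] + q[k - 2])
    return list(zip(p, q))


def evaluate(entries):
    """Evaluate [a_0, ..., a_n] by unravelling the nested reciprocals from the inside out."""
    value = Fraction(entries[-1])
    for a_k in reversed(entries[:-1]):
        value = a_k + 1 / value
    return value


for p_k, q_k in convergents([1, 2, 2, 2, 2, 2]):
    print(p_k, q_k, p_k / q_k)
```

The entries used in the loop are the first six of the expansion with 1 followed by repeated 2s, whose convergents approach $\sqrt{2}$.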
Proof of the Recurrence Relations We prove that $p_k$ and $q_k$ satisfy the recurrences
(3.4) by induction. As induction hypothesis we assume that for any continued fraction
of length less than or equal to $m$, the corresponding numerators and denominators
satisfy the recurrence relations (3.4).
For our given $x$ we let

$$\frac{p_k'}{q_k'} = [a_1, a_2, a_3, \ldots, a_k], \qquad k = 1, 2, 3, \ldots$$

where the integers $p_k'$ and $q_k'$ are coprime. The induction hypothesis is supposed to
hold for the fraction $[a_1, a_2, a_3, \ldots, a_m]$, which is of length $m$, so that we have

$$p_m' = a_m p_{m-1}' + p_{m-2}', \qquad q_m' = a_m q_{m-1}' + q_{m-2}'.$$

Furthermore, by the definition of continued fraction, the relation

$$\frac{p_k}{q_k} = a_0 + \frac{q_k'}{p_k'}$$

holds for all $k$ and, recalling that $p_k'$ and $q_k'$ are coprime, we see that

$$p_k = a_0 p_k' + q_k', \qquad q_k = p_k'.$$

Therefore, after some algebraic manipulation, we find

$$p_m = a_m p_{m-1} + p_{m-2}, \qquad q_m = a_m q_{m-1} + q_{m-2}$$

and the proof of (3.4) is complete.


The rest of the proof that $\lim_{n\to\infty} p_n/q_n = x$ is given in steps in Exercise 1 and
builds almost entirely on the relations (3.4).
3.12.1 Exercises


1. Complete the proof that $\lim_{n\to\infty} p_n/q_n = x$ in the following steps:


(a) Show that

$$p_n q_{n+1} - p_{n+1} q_n = (-1)^{n+1}, \qquad n = 0, 1, 2, \ldots$$

(b) Show that

$$\left| \frac{p_n}{q_n} - \frac{p_{n+1}}{q_{n+1}} \right| = \frac{1}{q_n q_{n+1}}.$$

(c) Show that if the sequence $a_n$ does not terminate then the integer sequences
$p_n$ and $q_n$ are strictly increasing and satisfy $\lim_{n\to\infty} p_n = \lim_{n\to\infty} q_n = \infty$.

(d) Assume that the fraction does not terminate and set $y_n = 1/x_n$. Show that,
for each $n$, we have

$$x = [a_0, a_1, \ldots, a_{n-1}, y_n].$$

Here $a_n$ is replaced by $y_n$.

(e) Assume that the fraction does not terminate. Then for each $n$ the number $x$
lies between $[a_0, a_1, \ldots, a_n]$ and $[a_0, a_1, \ldots, a_n, a_{n+1}]$.

(f) Show that 


This completes the proof that $p_n/q_n \to x$. It gives much more. We get an infinite
sequence of rational approximations $a/b$ to $x$ such that the error is at most $1/b^2$.
The price of an error less than $\varepsilon$ is a denominator at most $1/\sqrt{\varepsilon}$. This is much
better than what can be achieved by decimal expansions.


2. Show that the continued fraction of a rational number terminates. 


Hint. Show that if $x$ is rational there can be at most finitely many rational approximations
$a/b$ that satisfy

$\left| x - \frac{a}{b} \right| < \frac{1}{b^2}.$

From a non-terminating fraction we get infinitely many such approximations.
3. Show that
$\sqrt{2} = [1, \overline{2}].$


The overline means that the entry “2” repeats indefinitely. Tabulate values of $p_k$
and $q_k$ and observe that $p_5/q_5 = 99/70$, which gives $\sqrt{2}$ with an error less than
$10^{-4}$.

4. With $y_k$ as defined in Exercise 1(d), show that for all $k$ we have
$x = \frac{p_{k-1} y_k + p_{k-2}}{q_{k-1} y_k + q_{k-2}}.$


5. With $y_k$ as defined in Exercise 1(d), show that $y_k = [a_k, a_{k+1}, a_{k+2}, \dots]$, where
$x = [a_0, a_1, a_2, \dots]$.

6. Suppose that the continued fraction of $x$ is periodic, that is to say, it is of the form
$[\overline{a_0, a_1, \dots, a_N}]$, the overline indicating that the string of entries repeats indefinitely.
Show that $x$ satisfies a quadratic equation with integer coefficients.

7. Calculate the values of the following continued fractions. 


(a) $[\overline{1}]$, that is, $[1, 1, 1, \dots]$. This is surely the simplest continued fraction that
represents an irrational number. 

(b) [1, 2] 

(c) [1, 2, 1]. 


8. Show that if the continued fraction of x has the form 


$[a_0, a_1, \dots, a_m, \overline{a_{m+1}, \dots, a_{m+n}}],$


consisting of a string that repeats indefinitely after an initial string, then x is an 
irrational root of a quadratic equation with integer coefficients. 

Hint. Use Exercises 4, 5 and 6, and observe that if $y_k$ is the root of a quadratic
equation then so is x. 


Note. The converse is also true. Every quadratic irrational has a continued fraction of this form. 
This was shown by Lagrange. The proof is not hard but requires a little number theory. 


9. The continued fraction algorithm gives a handy way of finding integers x and y 
that satisfy $ax - by = 1$ for a given pair of coprime integers $a$ and $b$.
Hint. Stare at the result of Exercise 1(a). This also shows that continued fractions 
are closely related to the Euclidean algorithm; the latter is often used to find x 
and y. 
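
The recurrence relations lend themselves to computation. The following Python sketch is not part of the original exercises. The function name `convergents` is hypothetical, and the starting values $p_{-2} = 0$, $q_{-2} = 1$, $p_{-1} = 1$, $q_{-1} = 0$ are a convention assumed here so that the recurrences reproduce $p_0 = a_0$, $q_0 = 1$, $p_1 = a_0 a_1 + 1$ and $q_1 = a_1$. It tabulates the first convergents of $[1, 2, 2, 2, \dots]$, the fraction of Exercise 3.

```python
from itertools import chain, islice, repeat

def convergents(partial_quotients):
    """Yield (p_n, q_n) for n = 0, 1, 2, ... from the entries a_0, a_1, a_2, ..."""
    p_prev, q_prev = 0, 1   # assumed convention: p_{-2} = 0, q_{-2} = 1
    p, q = 1, 0             # assumed convention: p_{-1} = 1, q_{-1} = 0
    for a in partial_quotients:
        # p_n = a_n p_{n-1} + p_{n-2}, and likewise for q_n
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        yield p, q

# First six convergents of [1, 2, 2, 2, ...]
table = list(islice(convergents(chain([1], repeat(2))), 6))
for n, (p, q) in enumerate(table):
    print(n, p, q, p / q)
```

The row for $n = 5$ is $99/70$, as Exercise 3 asks the reader to observe. Successive rows can also be used to check the identity of Exercise 1(a) on particular numbers, which is of course no substitute for proving it.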


3.12.2. Pointers to Further Study 


— Number theory 
— Irrationality theory 
— Diophantine analysis 


Chapter 4
Functions and Continuity


A function of a variable quantity is an analytic expression 
composed in any way whatsoever of the variable quantity and 
numbers or constant quantities 


L. Euler 


4.1 How Do We Talk About Functions? 


A function or mapping assigns to each point in a set A a point in a set B. We shall 
allow a rather wide scope for the understanding of “assigns”. It does not have to be 
an assignment using a formula in the ordinary sense, though that is very often the 
case in analysis. 

An exact definition of the concept of function can be based on set theory. It 
seems to dispel all mysteries connected with the meaning of assignment, identify- 
ing a function with a certain set (its graph in fact), but the clarity thus gained is a 
little misleading for it raises the question of what sets are to be allowed. It is ques- 
tionable whether the set-theoretical definition of function is needed for fundamental 
analysis. 

As with any other object of interest in mathematics, we use letters to symbolise 
functions; “f” is often the first choice, if available, followed by “g”. The set A is 
called the domain of the function whilst the set B is called its codomain. We write 


$f : A \to B$


which is read “$f$ is a function with domain $A$ and codomain $B$” or “$f$ maps $A$ to $B$”.
If $x \in A$ (that is, if $x$ is an element of $A$), we denote the element of $B$ that $f$ assigns
to x by f(x) and call it the value of f at x. 
Although it should be clear from the definition, it is worth emphasising that a
function assigns only one value to each point in its domain. If we wish to say that 
the square root of 4 is plus 2 or minus 2, then we are not using a function. 

The words “function” and “mapping” mean the same but “function” is mostly 
used when the codomain consists of numbers, rather than vectors or more exotic 
objects. 


4.1.1 Examples of Specifying Functions 


Here are some examples of how we specify a function. They show some acceptable 
ways to assign values of varying degrees of formality. 


(a) $f : \mathbb{R} \to \mathbb{R}$, $f(x) = x^2$.


This says that $f$ maps $\mathbb{R}$ to $\mathbb{R}$ and assigns to each $x$ its square $x^2$. We say informally
“$f$ is the function $x^2$”.


(b) $f : [0, \infty[ \to \mathbb{R}$, $f(x) = \sqrt{x}$.


This says that $f$ maps the set of positive real numbers, together with 0, to $\mathbb{R}$, and
assigns to each such number $x$ its square root $\sqrt{x}$. We say informally “$f$ is the
function $\sqrt{x}$”.


(c) $f : ]0, 500] \to \mathbb{R}$, $f(x) = \begin{cases} 10, & \text{if } 0 < x < 100 \\ 20, & \text{if } 100 \le x \le 500 \end{cases}$


The presentation here is called specifying a function by cases. This function could 
be a list of postal charges. 
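
A definition by cases translates directly into a conditional. The sketch below is only an illustration; the name `postal_charge` is hypothetical, and the values 10 and 20 and the break at 100 are taken from example (c) as read here.

```python
def postal_charge(weight):
    """Charge for a weight in the domain ]0, 500], following example (c)."""
    if not 0 < weight <= 500:
        # Outside the domain the function simply has no value.
        raise ValueError("weight outside the domain ]0, 500]")
    return 10 if weight < 100 else 20

print(postal_charge(99.99), postal_charge(100))
```

Raising an error outside $]0, 500]$ is one way to respect the fact that a function assigns values only to points of its domain.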


(d) The function $f : [0, 1[ \to \mathbb{R}$, where $f(x) = 1$ if the digit 9 appears in the
decimal representation of x and f(x) = 0 otherwise. We use a decimal 
representation that does not end in repeating 9’s. 


There seems to be no practical general way to compute f (x) in this example, although 
we do have an algorithm for the decimal digits, so things are not as bad as they might 
be. 

Another convenient, and less formal, way to specify functions is typified by the 
example: 


(e) $f(x) = \sqrt{x}$, $(x > 0)$.


The codomain is not given (it is not always important), but the domain is indicated, 
although sets are not mentioned. 
Fig. 4.1 Two views of a function: a graph, $y = f(x)$, and an assignment, $x \mapsto f(x)$

Often the domain is not mentioned at all when we specify a function informally, 
but it is inferred, as in the example: 


(f) $\frac{(x-1)(x-2)}{(x-3)(x-4)}$.

Here we mean the function $f$ with domain $A = \mathbb{R} \setminus \{3, 4\}$ and codomain $\mathbb{R}$, such
that
$f(x) = \frac{(x-1)(x-2)}{(x-3)(x-4)}.$


The graph of a function $f$ with domain $A \subset \mathbb{R}$ and codomain $\mathbb{R}$ is the set of all
pairs $(x, y)$ such that $x \in A$ and $y = f(x)$. We can view the graph as a curve (of
some sort) in the plane with coordinates $x$ and $y$, and this is a useful way to visualise
$f$. An example of Weierstrass of a continuous function that is nowhere differentiable,
and the space filling curves of Peano, show the limitations of such a picture.
Figure 4.1 illustrates the two ways to view a function: as a graph or as an assignment. 

The domains of functions considered here will be subsets of R. It is clearly too 
limiting to consider only functions with domain R; the above examples illustrate this. 
However the typical domains for calculus are intervals, or finite unions of intervals. 

Sometimes we speak of a function y = f(x), instead of just f, as if we have the 
graph in mind. Or else we are thinking of x and y as variables and expressing a 
relation between them, a point of view common in physics (think of pressure P and 
volume V, and Boyle’s Law of ideal gases). We may even say “the function f(x)”, 
although strictly speaking f(x) would be the value that f assigns to the number x. 
It offers a visual cue that a function is referred to, rather than a number that might 
be denoted by f. One should bear in mind that mathematics is not only a mode of 
thinking, but also a mode of communication. 


4.2 Continuous Functions 


If a parcel weighs 100 grammes, and the postal charges are as in example (c) of the 
last section, we will not be happy to pay 20 pounds in postage. We might object that 
it really weighs 99.99 grammes and the post office should have their scales checked. 
The problem here is that the function in question is discontinuous. The definition of
continuity of a function $f$ at a point $x_0$ seems at first glance to be designed to exclude
a jump in the graph of the function at $x_0$. This is a bit of an oversimplification; we
will see later that it does more than this and excludes some other types of undesirable 
behaviour as well. The implications of the definition are not very obvious and will 
only gradually become clear. Until then the reader is asked to take on trust that the 
definition of continuity presented here is appropriate. 

In the definitions below, the domain of the function is a subset A of R. In most 
practical cases A is an interval or a finite union of intervals. 


Definition Let $f : A \to \mathbb{R}$ where $A$ is a subset of $\mathbb{R}$. Let $x_0 \in A$. We say that the
function $f$ is continuous at $x_0$, or that $x_0$ is a point of continuity of $f$, if the following
condition is satisfied:


For each $\varepsilon > 0$ there exists $\delta > 0$, such that $|f(x) - f(x_0)| < \varepsilon$ for all $x$ in $A$
that satisfy $|x - x_0| < \delta$.


If $f$ is not continuous at $x_0$ we say that $f$ is discontinuous at $x_0$, or that $x_0$ is a point
of discontinuity of $f$.


Definition Let $f : A \to \mathbb{R}$ where $A$ is a subset of $\mathbb{R}$. We say that the function $f$ is
continuous if it is continuous at every point of A. 


As in the definition of limit of a sequence, we have made a small concession to
natural English. It would be more precise, but less natural, to define the condition of
continuity as follows: for each $\varepsilon > 0$ there exists $\delta > 0$, such that for all $x$ in $A$ that
satisfy $|x - x_0| < \delta$ we have $|f(x) - f(x_0)| < \varepsilon$. In a simplified first-order logic
notation we can lay bare the logical structure of this condition:


$(\forall \varepsilon > 0)(\exists \delta > 0)(\forall x \in A)(|x - x_0| < \delta \Rightarrow |f(x) - f(x_0)| < \varepsilon).$
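
The order of the quantifiers can be made tangible by a small numerical experiment. The Python sketch below is an illustration only and proves nothing: it assumes $A = \mathbb{R}$, the name `delta_works` is hypothetical, and it merely samples finitely many points $x$ with $|x - x_0| < \delta$ to see whether a proposed $\delta$ survives for a given $\varepsilon$.

```python
def delta_works(f, x0, eps, delta, samples=10_000):
    """Sample points x with |x - x0| < delta and test |f(x) - f(x0)| < eps.

    Returns False as soon as a sampled point violates the inequality.
    A return value of True is only evidence, never a proof.
    """
    for k in range(1, samples):
        x = x0 - delta + 2 * delta * k / samples   # points spread across ]x0 - delta, x0 + delta[
        if abs(f(x) - f(x0)) >= eps:
            return False
    return True

square = lambda x: x * x
step = lambda x: 10 if x < 100 else 20     # a function with a jump at 100

print(delta_works(square, 3.0, 0.1, 0.1 / 7))   # delta chosen so that delta * (6 + delta) < 0.1
print(delta_works(square, 3.0, 0.1, 1.0))       # this delta is too large
print(delta_works(step, 100.0, 5.0, 1e-6))      # at the jump, small deltas do not help
```

For the squaring function the first $\delta$ passes and the second fails, whereas for the step function even a very small $\delta$ fails once $\varepsilon$ is smaller than the jump.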


In most cases A is an interval. Even so, this is not quite general enough for our 
purposes. In the following pages, when we write $f : A \to \mathbb{R}$ we shall mean that $A$
is a subset of $\mathbb{R}$ (not necessarily an interval).

The definition of continuity, like the definition of limit and the nature of the real
numbers, took a long time to crystallise into its present form. It seems to have been
thought that continuity must be seen as a property of the function as a whole, akin 
to saying that its graph hangs together in one piece; or even more loosely, that the 
graph can be drawn without lifting the pencil from the paper. It finally became clearer 
that the way forward was to define continuity at a point first, and only then to define 
continuity as a whole to mean that the function was continuous at each point. None 
of this was originally at all obvious. 
4.2.1 Exercises


1. Because the set $A$ is quite arbitrary the definition of continuity has a perhaps
unexpected consequence. We say that a point $x_0 \in A$ is an isolated point of $A$
if there exists $\delta > 0$ such that $x_0$ is the only point both in $A$ and in the interval
$]x_0 - \delta, x_0 + \delta[$. This means that if $x \in A$ and $|x - x_0| < \delta$ then $x = x_0$.


Show that if $f : A \to \mathbb{R}$ and $x_0$ is an isolated point of $A$ then $f$ is automatically
continuous at $x_0$.


2. Is 0 a point of discontinuity for the function 1/x? 


3. For each $x_0$ in its domain, and for each $\varepsilon > 0$, find a suitable $\delta$, thus proving that
the following functions are continuous:


(a) $f(x) = 1$
(b) $f(x) = x$
(c) $f(x) = x^2$
(d) $f(x) = 1/x$.


4. Let $f : [0, 1[ \to \mathbb{R}$ be the function defined in Sect. 4.1 Example (d), that assigns
1 to x if the decimal expansion of x contains the digit 9, using if possible the 
terminating expansion, and assigns 0 otherwise. 

Show that f is continuous at x if and only if f(x) = 1. 


5. Suppose that the function $f : A \to \mathbb{R}$ is continuous at the point $c$ and that
$f(c) < d$ [respectively $f(c) > d$]. Show that there exists $\delta > 0$, such that
$f(x) < d$ [respectively $f(x) > d$] for all $x$ in $A$ that satisfy $|x - c| < \delta$.


6. A function $f$ with domain $A$ is said to be upper semi-continuous [respectively,
lower semi-continuous] at a point $x_0$ in $A$ if the following condition is satisfied:
for all $\varepsilon > 0$ there exists $\delta > 0$, such that $f(x) < f(x_0) + \varepsilon$
[respectively, $f(x) > f(x_0) - \varepsilon$] for all $x$ in $A$ that satisfy $|x - x_0| < \delta$.


(a) Show that $f$ is continuous at $x_0$ if and only if it is both upper semi-continuous
and lower semi-continuous at $x_0$.

(b) Suppose that $f$ is upper semi-continuous [respectively lower semi-continuous]
at a point $c$. Suppose that $f(c) < d$ [respectively $f(c) > d$].
Show that there exists $\delta > 0$, such that $f(x) < d$ [respectively $f(x) > d$]
for all $x$ in $A$ that satisfy $|x - c| < \delta$.


4.2.2 Limits of Functions 


Let $f : A \to \mathbb{R}$ where $A$ is a subset of $\mathbb{R}$. In practice $A$ is often an interval, or an interval
minus a finite set of points. The reader should recall the definition of limit point of 
a set (see Sect. 3.5). 
Definition Let $c$ be a limit point of the set $A$ and let $t$ be a real number. We say that
the limit of $f$ at $c$ exists and equals $t$, and we write

$\lim_{x \to c} f(x) = t,$

if the following condition is satisfied:


For every $\varepsilon > 0$ there exists $\delta > 0$, such that $|f(x) - t| < \varepsilon$ for all $x$ in $A$ that
satisfy $0 < |x - c| < \delta$.


We sometimes write more informally

$f(x) \to t$ as $x \to c$

to express that the limit of $f$ at $c$ equals $t$.


Note the following points: 


(a) Because $c$ is required to be a limit point, there exists $x$ in $A$ with $0 < |x - c| < \delta$.
This means that the situation, that the limit exists and equals $t$, cannot arise by
default, which would happen with any number $t$ whatsoever if there were no
points $x$ to be tested.

(b) If $A$ is an interval (commonly the case), then $c$ is either in $A$ or else $c$ is an
endpoint (or both).

(c) If $c$ is in $A$ then the value $f(c)$ has no influence on the limit.

(d) The variable $x$ in the expression “$\lim_{x \to c} f(x) = t$” is a bound variable. Any
other letter may be used instead of “$x$”, for example “$\lim_{q \to c} f(q) = t$” has the
same meaning as “$\lim_{x \to c} f(x) = t$”.

(e) The limit $t$ may, or may not, be a value of the function $f$ at some $x$ not equal to $c$. It
is quite possible for $f$ to take the value $t$ at points in the interval $]c - h, c + h[$,
excluding $c$, for every $h > 0$. Confusion over this caused problems in early
thinking about limits (as was also pointed out in connection with the limit of a
sequence).


Although it might be thought nice to display $\delta$ as a function of $\varepsilon$, for example to
get explicit error estimates, this is not necessary to verify the definition of limit, nor
is it always helpful. To produce a $\delta$ that works for a given $\varepsilon$ some arbitrary choices
may have to be made, such as that of selecting in a non-explicit fashion a number
from a non-empty set. It is not hard to see that the set of possible $\delta$’s for a given $\varepsilon$,
if bounded above, is an interval of the form $]0, \delta_{\max}]$. In such a case we could, if we
wished, define the function $\delta(\varepsilon) = \delta_{\max}$, but this is not necessarily useful.


Exercise Check the claim made at the end of the last paragraph about the set of
possible $\delta$’s forming an interval.


The following result is often needed: 




Proposition 4.1 Let $c$ be a limit point of the set $A$ and suppose that the limit
$\lim_{x \to c} f(x)$ exists (and is a finite number). Then there exists $h > 0$, such that $f$
is bounded on the set $]c - h, c + h[ \cap A$.


Proof Write $t = \lim_{x \to c} f(x)$. Using $\varepsilon = 1$ we see that there exists $h > 0$, such that $|f(x) - t| < 1$ for
all $x$ in $A$ that satisfy $0 < |x - c| < h$. For such $x$ we have $|f(x)| < 1 + |t|$. Let
$K = 1 + |t|$ if $c \notin A$ and $K = \max(1 + |t|, |f(c)|)$ if $c \in A$. Then $|f(x)| \le K$ for
all $x$ in $]c - h, c + h[ \cap A$.


The limit $\lim_{x \to c} f(x)$, if it exists, is unique. This was anticipated in our use
of the definite article. It is impossible, for distinct real numbers $s$ and $t$, that both
$\lim_{x \to c} f(x) = s$ and $\lim_{x \to c} f(x) = t$. If it were so we could choose $\varepsilon$, such that
$0 < \varepsilon < \frac{1}{2}|s - t|$, and find $\delta > 0$, such that $|f(x) - s| < \varepsilon$ and also $|f(x) - t| < \varepsilon$
for all $x$ in $A$ that satisfy $0 < |x - c| < \delta$. Such points $x$ exist since $c$ is a limit point
of $A$. But then we would have


$|s - t| \le |s - f(x)| + |f(x) - t| < 2\varepsilon < |s - t|,$


which is impossible. 


4.2.3 Connection Between Continuity and Limit 


Arguments about continuity can often be rephrased as arguments about limits. This 
is due to the following result. 


Proposition 4.2 Let $f : A \to \mathbb{R}$ and let $c$ be a point in $A$ that is also a limit point
of $A$. Then $f$ is continuous at $c$ if and only if $f(c) = \lim_{x \to c} f(x)$.


Proof Assume first that $f$ is continuous at $c$. Let $\varepsilon > 0$. There exists $\delta > 0$, such
that $|f(x) - f(c)| < \varepsilon$ if $x \in A$ and $|x - c| < \delta$, and therefore in particular if $x \in A$
and $0 < |x - c| < \delta$. This says that $f(c) = \lim_{x \to c} f(x)$.

Next assume that $f(c) = \lim_{x \to c} f(x)$. Let $\varepsilon > 0$. There exists $\delta > 0$, such
that $|f(x) - f(c)| < \varepsilon$ if $x \in A$ and $0 < |x - c| < \delta$. But then we also have
$|f(x) - f(c)| < \varepsilon$ if $x \in A$ and $|x - c| < \delta$, since it obviously holds when $x = c$.


If $c$ is in $A$ but is not a limit point of $A$, then $f$ is automatically continuous at $c$.
We can choose $\delta > 0$ so small that the conditions $|x - c| < \delta$ and $x \in A$ are only
satisfied when $x = c$, and then $f(x) - f(c) = 0$.


4.2.4 Limit Rules 


Limit rules allow us to establish new limits from old ones, usually without having 
to use the definition of limit. The limit rules for functions are similar to those for 
sequences, and the similarity extends to their proofs. 
Proposition 4.3 Let $f : A \to \mathbb{R}$, $g : A \to \mathbb{R}$, let $c$ be a limit point of $A$ and let
$\lim_{x \to c} f(x) = s$, $\lim_{x \to c} g(x) = t$. We have the following rules:


(1) (Sum) $\lim_{x \to c} f(x) + g(x) = s + t$
(2) (Product) $\lim_{x \to c} f(x) \cdot g(x) = s \cdot t$
(3) (Absolute value) $\lim_{x \to c} |f(x)| = |s|$
(4) (Reciprocal) If $s \ne 0$ then $\lim_{x \to c} \frac{1}{f(x)} = \frac{1}{s}$.

Note. In rule 4 the function $1/f(x)$ is possibly not defined for all $x \in A$ because
$f$ can have zeros. However, if $\lim_{x \to c} f(x) \ne 0$ the zeros of $f$, if any, other than
possibly $c$ itself, are a safe distance from $c$. More precisely there is an interval
$I = ]c - h, c + h[$, such that $f$ has no zero in $I \cap A \setminus \{c\}$, on which domain we may
define $1/f$.


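Proof of the Limit Rules (1) Let $\varepsilon > 0$. We choose $\delta > 0$, such that $|f(x) - s| < \varepsilon/2$
and $|g(x) - t| < \varepsilon/2$ for all $x \in A$ that satisfy $0 < |x - c| < \delta$ (the same $\delta$ for both
$f$ and $g$). For such $x$ we have


$|f(x) + g(x) - (s + t)| \le |f(x) - s| + |g(x) - t| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$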


(2) Since the limit $\lim_{x \to c} f(x)$ exists, there exist $K > 0$ and $h > 0$, such that for all
$x$ in $]c - h, c + h[ \cap A$ we have $|f(x)| < K$.

Let $\varepsilon > 0$. Choose $\delta_1 > 0$, such that $|f(x) - s| < \varepsilon$ and $|g(x) - t| < \varepsilon$ for all
$x \in A$ that satisfy $0 < |x - c| < \delta_1$. Set $\delta = \min(\delta_1, h)$. If $x \in A$ and $0 < |x - c| < \delta$ we have


$|f(x)g(x) - st| = |f(x)g(x) - f(x)t + f(x)t - st|$
$\le |f(x)|\,|g(x) - t| + |t|\,|f(x) - s|$
$< K\varepsilon + |t|\varepsilon$
$= (K + |t|)\varepsilon.$


We conclude that $\lim_{x \to c} f(x) \cdot g(x) = s \cdot t$. It may help to reread the discussion in
the proof of Proposition 3.5 in connection with the product of sequences.


(3) The proof is almost identical to the corresponding one for sequences. 


(4) We have

$\left| \frac{1}{f(x)} - \frac{1}{s} \right| = \frac{|s - f(x)|}{|s|\,|f(x)|}.$


By assumption $\lim_{x \to c} |f(x)| = |s|$ and $s \ne 0$. There therefore exists $h > 0$, such
that $|f(x)| > \frac{1}{2}|s|$ for all $x$ in $A$ that satisfy $0 < |x - c| < h$. For such $x$ we have


$\left| \frac{1}{f(x)} - \frac{1}{s} \right| \le \frac{2}{|s|^2}\,|f(x) - s|.$


The conclusion follows from the fact that the right-hand side has the limit 0.
The justification for the last line of the proof of rule 4 is left to the reader. In
fact there is a choice. One can give an argument starting “Let $\varepsilon > 0$” as in rules
1 and 2. Alternatively one can avoid mentioning $\varepsilon$ at all by using a version of the
squeeze rule for limits of functions; compare the corresponding rule for sequences,
Proposition 3.8. The rule for functions reads as follows:


Let $f$, $g$ and $h$ be functions with domain $A$, and let $c$ be a limit point of $A$.
Assume that there exists $\delta > 0$, such that $g(x) \le f(x) \le h(x)$ for all $x$ in $A$
that satisfy $0 < |x - c| < \delta$, and that $\lim_{x \to c} g(x) = \lim_{x \to c} h(x) = t$. Then
$\lim_{x \to c} f(x) = t$.


The proof of the squeeze rule is also left to the reader. 


4.2.5 Continuity Rules 


The limit rules give rise to continuity rules. Let $f : A \to \mathbb{R}$ and $g : A \to \mathbb{R}$ be
continuous at $c$. Then


(i) The sum $f + g$ is continuous at $c$.
(ii) The product $f \cdot g$ is continuous at $c$.
(iii) The absolute value $|f|$ is continuous at $c$.
(iv) If $f(c) \ne 0$ then the reciprocal $1/f$ is continuous at $c$ (where $1/f$ is defined
sufficiently close to $c$ to avoid zeros of $f$).


To begin the wholesale production of continuous functions we need to settle two 
initial cases, left to the reader to verify: 


(a) The constant function $f : \mathbb{R} \to \mathbb{R}$, $f(x) = C$ for all $x$ is everywhere continuous.
(b) The function f(x) = x (identity function) is everywhere continuous. 


From these and the continuity rules (i-iv), we immediately obtain a large number of 
continuous functions: 


(c) The function $x^n$ (where $n$ is a fixed natural number) is continuous.

(d) The polynomial $f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_0$ is continuous.

(e) If f and g are polynomials then the rational function f(x)/g(x) is continuous 
(on its domain naturally, which excludes the zeros of g). 
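
The way polynomials are assembled from constants and the identity function using only sums and products is mirrored by Horner's scheme for evaluating them. The Python sketch below is an illustration added here, not part of the original argument; the names `horner` and `rational` are hypothetical.

```python
def horner(coefficients, x):
    """Evaluate a_n x^n + ... + a_1 x + a_0; coefficients are listed from a_n down to a_0."""
    value = 0
    for a in coefficients:
        value = value * x + a    # one product and one sum per coefficient
    return value

def rational(numerator, denominator, x):
    """Evaluate a quotient of two polynomials at a point of its domain."""
    d = horner(denominator, x)
    if d == 0:
        # Zeros of the denominator are excluded from the domain.
        raise ZeroDivisionError("x is a zero of the denominator")
    return horner(numerator, x) / d

# (x - 1)(x - 2) / ((x - 3)(x - 4)), expanded: (x^2 - 3x + 2) / (x^2 - 7x + 12)
print(rational([1, -3, 2], [1, -7, 12], 5))
```

Each pass through the loop applies the product rule and then the sum rule, which is exactly how continuity propagates from (a) and (b) to (d).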


4.2.6 Left and Right Limits 


Let $f : ]a, b[ \to \mathbb{R}$ and let $a < c < b$. We can consider separately the two functions


$f_1 : ]a, c[ \to \mathbb{R}, \quad f_1(x) = f(x), \quad (a < x < c)$
$f_2 : ]c, b[ \to \mathbb{R}, \quad f_2(x) = f(x), \quad (c < x < b).$
Fig. 4.2 The simplest discontinuities. $x = a$: jump discontinuity; $x = b$: removable discontinuity

They are called the restrictions of $f$ to the intervals $]a, c[$ and $]c, b[$.

The limits $\lim_{x \to c} f_1(x)$ and $\lim_{x \to c} f_2(x)$ are called the left-hand and right-hand
limits of $f$ at $c$. They are jointly called one-sided limits. The usual notation, which
does not mention $f_1$ or $f_2$, refers to them by


$\lim_{x \to c-} f(x)$ and $\lim_{x \to c+} f(x)$,

or else even more simply, by


$f(c-)$ and $f(c+)$.


Furthermore when $f$ has the domain $]a, b[$ it is quite common to denote the
limits $\lim_{x \to a} f(x)$ and $\lim_{x \to b} f(x)$, quite unnecessarily, by $\lim_{x \to a+} f(x)$ and
$\lim_{x \to b-} f(x)$, as a notational reminder that $x$ can only approach $a$ from the right
and $b$ from the left.


Proposition 4.4 The function $f : ]a, b[ \to \mathbb{R}$ is continuous at $c \in ]a, b[$ if and only
if the one-sided limits $f(c-)$ and $f(c+)$ exist and are equal to $f(c)$.


The proof of this is rather obvious, but the proposition is worth stating because it 
suggests somewhat graphically one of the characteristic ways we think about failure 
of continuity. If the left and right limits both exist but are unequal we say that f has 
a jump discontinuity at $x = c$. The difference $f(c+) - f(c-)$ is called the height
of the jump, or simply the jump, at c. This being non-zero is the simplest way that a 
function f can be discontinuous, but it is by no means the only way. 

It is possible for a function to be discontinuous at $c$ because one or both of the
one-sided limits $\lim_{x \to c-} f(x)$ and $\lim_{x \to c+} f(x)$ fail to exist. It is also possible that
the one-sided limits are equal, so that there is no jump, but they are different from
$f(c)$. Then $f$ is discontinuous at $c$ because, somehow, the “wrong value” is assigned
to $f$ at $c$. This is sometimes called a removable discontinuity; by changing the value
at $c$ the function can be made continuous there (Fig. 4.2).
4.2.7 The Limits $\lim_{x \to \infty} f(x)$ and $\lim_{x \to -\infty} f(x)$


Let us suppose that the domain of $f$ includes an interval of the form $]a, \infty[$.


Definition We say that the limit $\lim_{x \to \infty} f(x)$ exists and equals $t$ if the following
condition is satisfied:


For each $\varepsilon > 0$ there exists $K$, such that $|f(x) - t| < \varepsilon$ for all $x$ that satisfy
the inequality $x > K$.


In a similar way, $\lim_{x \to -\infty} f(x) = t$ means that for each $\varepsilon > 0$, there exists $K$,
such that $|f(x) - t| < \varepsilon$ for all $x$ that satisfy $x < K$ (of course we assume that the
domain of $f$ includes an interval of the form $]-\infty, a[$).

The limit rules of Proposition 4.3 and the squeeze rule all hold with obvious
modifications for limits of the kind $\lim_{x \to \infty} f(x)$ and $\lim_{x \to -\infty} f(x)$. The reader
should write out the proofs.

Geometrically, saying that $\lim_{x \to \infty} f(x) = t$, or $\lim_{x \to -\infty} f(x) = t$, means that
the line $y = t$ is a horizontal asymptote to the curve $y = f(x)$.


4.2.8 The Limits $\pm\infty$


Let $f : A \to \mathbb{R}$ and let $c$ be a limit point of $A$. Most often $A$ is an interval and $c$ a
point in $A$ or an endpoint of $A$ (or both).


Definition We say that $f$ tends to $\infty$, or has the limit $\infty$ as $x$ tends to $c$, and we
write $\lim_{x \to c} f(x) = \infty$, when the following condition is satisfied:


For each $K$ there exists $\delta > 0$, such that $f(x) > K$ for all $x \in A$ that satisfy
$0 < |x - c| < \delta$.


To define $\lim_{x \to c} f(x) = -\infty$ we require $f(x) < K$ instead of $f(x) > K$.


Geometrically, saying that $\lim_{x \to c} f(x) = \infty$, or $\lim_{x \to c} f(x) = -\infty$, means that
the line $x = c$ is a vertical asymptote to the curve $y = f(x)$, although in practice
we usually speak of an asymptote when the limit is one-sided, for example, when
$\lim_{x \to c-} f(x) = \infty$ (Fig. 4.3).

The reader should supply definitions for the notion $\lim_{x \to \infty} f(x) = \infty$, and three
other similar ones obtained by inserting minus signs.

Although one says “tends to infinity” one never says “converges to infinity”, the
verb “converge” or the adjective “convergent” always implying a finite limit. Some
say “diverges to infinity” in the cases $\lim_{n \to \infty} a_n = \infty$ or $\lim_{x \to c} f(x) = \infty$. The
elements $\infty$ and $-\infty$ are not numbers, but they can be limits. So one has to be careful
about saying “The limit $\lim_{x \to c} f(x)$ exists”, always adding “and is a finite number.”
if that is necessary for clarity.
Fig. 4.3 Asymptotes as infinite limits: $\lim_{x \to c+} f(x) = \infty$ and $\lim_{x \to \infty} f(x) = t$


4.2.9 Exercises (cont’d) 


7. Find the following limits or explain why they do not exist: 


(a) $\lim_{x \to 1} x^3 + x^2 + x + 1$


. ol 
im: 

lim — 
oO hy 
@) | | 

im 

x>l x—-—1 

-1 

(e) lim 

x>l x—-—1 
(f) lim /x 

xX—>0O 
(g) lim Vx +1-— Sx 


(h) lim J/x(x +1) —x 


CO 
Gd) lim W2@+D—<x 
P er 8! 
G) lim min(x, 2 — x). 


8. Prove that $\lim_{x \to \infty} x^3 - x^2 + x - 1 = \infty$ by showing how to find $M$, given $K$,
such that $x^3 - x^2 + x - 1 > K$ for all $x > M$. It does not have to be the best $M$.

9. Let $f$ be a function whose domain includes an interval of the form $]a, \infty[$. Show
that $\lim_{x \to \infty} f(x) = \lim_{t \to 0+} f(1/t)$, in the sense that if either limit exists then
so does the other and they are then equal; the limits $\pm\infty$ are allowed. This result,
although simple, is used so much that it is worth pointing it out.

10. Another often used device is the following. Let $f$ be a function defined on an
interval $A$ that contains 0, except possibly at 0 itself. Let $\lambda \ne 0$. Show that
$\lim_{x \to 0} f(x) = \lim_{x \to 0} f(\lambda x)$. Again we imply that if one limit exists then so
does the other.

Note. This is really an example of a composite function, where $f(x)$ is composed with the
function $\lambda x$, and is capable of countless variations.

11. Many readers may have been introduced at school to an important example of a 
limit, including a proof (of a sort) using the squeeze rule. We refer to the limit 


$\lim_{x \to 0} \frac{\sin x}{x} = 1.$

It is essential here that the angle is measured by the arc of the unit circle that
it sweeps out (that is, the angle is given in radians). The limit implies that $x$ is
an approximation to $\sin x$ when $x$ is small. It is surprisingly good; for example
$\sin 0.1 \approx 0.0998$ (0.1 rad, about 6°, is really not so small). The explanation for
this unexpected accuracy is to be found in the power series expansion of sin x, 
studied in Chap. 11. 
The proof of this limit offered in school mathematics is in two steps. In the first
the inequalities

$\sin x < x < \frac{\sin x}{\cos x}$

are established for $0 < x < \pi/2$, using a geometrical argument involving comparing
the areas of three plane figures.

Complete the proof of the limit, assuming these inequalities and the continuity 
of cos x. 
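
Before attempting the proof it can be instructive to look at numbers. The short Python sketch below is an added illustration and proves nothing; the name `check_inequalities` is hypothetical. It tests the two inequalities quoted above at a few sample points, using radians as the exercise requires, and prints the quotient $\sin x / x$.

```python
import math

def check_inequalities(x):
    """True if sin x < x < sin x / cos x holds at x (expected for 0 < x < pi/2)."""
    return math.sin(x) < x < math.sin(x) / math.cos(x)

for x in (1.0, 0.1, 0.01, 0.001):
    # x in radians; the quotient should approach 1 as x shrinks
    print(x, check_inequalities(x), math.sin(x) / x)
```

The printed quotients approach 1 as $x$ decreases, and the value at $x = 0.1$ agrees with the approximation $\sin 0.1 \approx 0.0998$ mentioned in the exercise.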


4.2.10 Bounded Functions 


Let $f : A \to \mathbb{R}$, where $A$ is an arbitrary subset of $\mathbb{R}$. We define the following notions
of boundedness for a function, paralleling those for sets and for sequences:


(a) The function $f$ is said to be bounded above, if there exists $K$, such that $f(x) < K$
for all $x \in A$.

(b) The function $f$ is said to be bounded below, if there exists $K$, such that $f(x) > K$
for all $x \in A$.

(c) The function $f$ is said to be bounded, if it is both bounded above and bounded
below. It is equivalent to saying that there exists $K > 0$, such that $|f(x)| < K$
for all $x \in A$.


If $f$ is bounded above we set

$\sup f := \sup\{ f(x) : x \in A \}.$

If $f$ is bounded below we set

$\inf f := \inf\{ f(x) : x \in A \}.$

More generally, if $B \subset A$ we denote the supremum (or infimum, with obvious
changes) of $f$ in $B$ by

$\sup_B f$, or $\sup_{x \in B} f(x)$, or $\sup_{a \le x \le b} f(x)$;

the last if $B$ is the interval $[a, b]$. There are many variations on these notations, more
or less self-explanatory.

If $f$ is continuous at $x_0$ then there exists $h > 0$, such that $f$ is bounded in the set
$]x_0 - h, x_0 + h[ \cap A$. Because of this we say that a continuous function is locally
bounded.


Exercise Prove this claim. 


4.2.11 Monotonic Functions 


The notions of increasing and decreasing for functions parallel those for sequences. 


Definition A function $f : ]a, b[ \to \mathbb{R}$ is said to be increasing if, for all $s$ and $t$ such
that $a < s < t < b$, we have $f(s) \le f(t)$. It is said to be decreasing if, for all $s$ and
$t$ such that $a < s < t < b$, we have $f(s) \ge f(t)$. A function that is either increasing
or decreasing is said to be monotonic.


As for sequences, we shall speak of a strictly increasing, or strictly decreasing, 
function when the inequalities are strict (that is, when equality is ruled out). 

The terms “monotonic” and “monotone” are completely equivalent. It is a matter
of taste, or even ease of speech, which one uses. 


Proposition 4.5 Let $f : ]a, b[ \to \mathbb{R}$ be an increasing function that is bounded above
or a decreasing function that is bounded below. Then $\lim_{x \to b-} f(x)$ exists and is a
finite number (not $\pm\infty$). The conclusion also holds if $b = \infty$.


Proof Suppose that $f$ is an increasing function bounded above. There exists $K > 0$,
such that $f(x) < K$ for all $x \in ]a, b[$. The set $M$ of all values taken by the function
$f$ is bounded above. Let $t = \sup M$. We shall show that $\lim_{x \to b-} f(x) = t$.

Let $\varepsilon > 0$. By the definition of supremum, there exists $x_1 \in ]a, b[$ such that
$t - \varepsilon < f(x_1)$. Since $f$ is increasing and bounded above by $t$, we must have
$t - \varepsilon < f(x) \le t$ for all $x$ in the interval $]x_1, b[$. We conclude that $\lim_{x \to b-} f(x) = t$.
For example, we can take $\delta = b - x_1$.

If $b = \infty$ a similar argument works. The case when $f$ is decreasing is similar,
using infimum instead of supremum.


Obviously similar conclusions hold for the limit $\lim_{x \to a+} f(x)$. Furthermore it
should be clear that if $f$ is increasing, but not necessarily bounded above, then
the limit $\lim_{x \to b-} f(x)$ exists if we allow $\infty$ as a limit; and similarly a decreasing
function not necessarily bounded below approaches a limit if we allow $-\infty$. With
this understanding we can say that a monotonic function approaches limits at both 
ends of its interval of definition. 


4.2.12 Discontinuities of Monotonic Functions 


Let $f : ]a, b[ \to \mathbb{R}$ be an increasing function and let $a < c < b$. By Proposition 4.5
the one-sided limits $\lim_{x \to c-} f(x)$ and $\lim_{x \to c+} f(x)$ exist, and


$f(c-) = \lim_{x \to c-} f(x) \le f(c) \le \lim_{x \to c+} f(x) = f(c+).$


If $f(c-) = f(c+)$ then $f$ is continuous at $c$, but otherwise $f$ is discontinuous at $c$
and has an upward jump discontinuity. The difference $f(c+) - f(c-)$ is called the
height of the jump. Similar conclusions hold for decreasing functions.

It appears that a monotonic function can only fail to be continuous by having 
a jump, upwards for an increasing function, downwards for a decreasing one. As 
a corollary we can conclude that a monotonic function f : [a,b] — R, that takes 
all values between f(a) and f(b), or, as we might say, has no gaps in its range, is 
continuous. A rigorous proof of this is illustrated in Fig. 4.4. The converse is also
true: a monotonic, continuous function has no gaps in its range. This is a simple
consequence of the intermediate value theorem that we consider in detail later. 

Most functions in practical applications are monotonic, or are increasing and 
decreasing piece-wise, switching between increasing and decreasing on successive 
intervals. So the commonest discontinuities are jumps. But it is easy to give an 
example of a function that is discontinuous without having a jump; the reader may 
consult the exercises. 


Fig. 4.4 A monotonic function with no jumps is continuous at $x_0$. Given $y_0$ and $\varepsilon$, points $c$ and $d$ exist by "no jumps"; take $\delta = \min(x_0 - c, d - x_0)$.


4.2.13 Continuity of $\sqrt{x}$


The function $f : ]0, \infty[ \to \mathbb{R}$, $f(x) = \sqrt{x}$ is defined in such a way that $\sqrt{x}$ is the
unique positive real number $y$ that satisfies $y^2 = x$.


Exercise Let $B = \{t \in \mathbb{R} : t > 0 \text{ and } t^2 < x\}$ and let $y = \sup B$. Show that $y$ satisfies
$y^2 = x$ and is the only positive solution.


It is easy to see that the function $x^2$ is increasing on the domain $]0, \infty[$ and the
same is true for $\sqrt{x}$. We show next that $\sqrt{x}$ is continuous at the point $c$, where $c > 0$.
Let $\varepsilon > 0$ and reduce $\varepsilon$, if necessary, so that $\varepsilon < \sqrt{c}$. Because $\sqrt{x}$ is an increasing
function we see that if

$$(\sqrt{c} - \varepsilon)^2 < x < (\sqrt{c} + \varepsilon)^2$$

then

$$\sqrt{c} - \varepsilon < \sqrt{x} < \sqrt{c} + \varepsilon.$$

We can therefore specify $\delta$ as follows:

$$\delta = \min\bigl((\sqrt{c} + \varepsilon)^2 - c,\; c - (\sqrt{c} - \varepsilon)^2\bigr).$$


Actually the continuity of $\sqrt{x}$ can be viewed in the light of the fact that an
increasing function without jumps is continuous, as was pointed out in the previous
section. The function $\sqrt{x}$ has no jumps because every positive real number is the
square root of another positive real number, namely that of its square.
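
The choice of $\delta$ made above can be checked numerically. What follows is only a minimal sketch in Python and not part of the argument; the names `sqrt_delta` and `check` are our own, and the rule of reducing $\varepsilon$ to at most $\sqrt{c}/2$ is just one assumed way of ensuring $\varepsilon < \sqrt{c}$.

```python
import math
import random


def sqrt_delta(c: float, eps: float) -> float:
    """Return the delta specified in the text for continuity of sqrt at c > 0."""
    root = math.sqrt(c)
    # Assumption: reduce eps to at most sqrt(c)/2, so that eps < sqrt(c).
    eps = min(eps, root / 2)
    return min((root + eps) ** 2 - c, c - (root - eps) ** 2)


def check(c: float, eps: float, trials: int = 10_000) -> bool:
    """Sample x with |x - c| < delta and test that |sqrt(x) - sqrt(c)| < eps."""
    delta = sqrt_delta(c, eps)
    rng = random.Random(0)  # fixed seed, so that the check is reproducible
    for _ in range(trials):
        x = c + delta * rng.uniform(-0.999, 0.999)
        if abs(math.sqrt(x) - math.sqrt(c)) >= eps:
            return False
    return True
```

For example, `sqrt_delta(4.0, 0.5)` evaluates $\min(2.5^2 - 4,\; 4 - 1.5^2)$. A sampling check of this kind can of course only support the inequality, not prove it.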


4.2.14 Composite Functions 


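Let $f : A \to \mathbb{R}$ and $g : B \to \mathbb{R}$. If $f(A) \subset B$ we may compose the functions to
obtain a function from $A$ to $\mathbb{R}$:


$$g \circ f : A \to \mathbb{R}, \qquad (g \circ f)(x) = g(f(x)), \quad x \in A.$$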


Figure 4.5 illustrates a way to visualise the composition of functions, using the notion 
of a function as an assignment. 
We have a new continuity rule. 


Proposition 4.6 If $x_0 \in A$, $f$ is continuous at $x_0$ and $g$ is continuous at $f(x_0)$, then
$g \circ f$ is continuous at $x_0$.


Proof Set $y_0 = f(x_0)$. Let $\varepsilon > 0$. Since $g$ is continuous at $y_0$ there exists $\delta_1 > 0$, such
that if $|y - y_0| < \delta_1$ and $y \in B$ then $|g(y) - g(y_0)| < \varepsilon$. But since $f$ is continuous
at $x_0$ there exists $\delta > 0$, such that if $|x - x_0| < \delta$ and $x \in A$ then $|f(x) - f(x_0)| < \delta_1$.
This implies that $|g(f(x)) - g(f(x_0))| < \varepsilon$ for all $x \in A$ that satisfy $|x - x_0| < \delta$.


Fig. 4.5 A view of the composition $g \circ f$


This continuity rule greatly increases our stock of continuous functions. As an
example, $\sqrt{1 - x^2}$ is continuous on its domain $[-1, 1]$.

When taking a limit of a composite function, $\lim_{x \to a} (g \circ f)(x)$, given the two
premises that $\lim_{x \to a} f(x) = b$ and $\lim_{y \to b} g(y) = t$, one has to be careful because
for the second premise the value of $g(b)$ is immaterial. So in general the expected
result, that the limit is $t$, can only be obtained on the assumption that there exists $\delta$,
such that $f$ avoids the value $b$ for all $x$ that satisfy $0 < |x - a| < \delta$.
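
The need for this extra assumption can be seen in a small example. The functions below are our own illustrative choice (with $a = b = t = 0$), written as a minimal Python sketch: $f$ is constant, so it never avoids the value $b$, and the composite has limit $g(b) = 1$ instead of $t = 0$.

```python
def f(x: float) -> float:
    """Constant function f(x) = 0, so the limit of f at a = 0 is b = 0."""
    return 0.0


def g(y: float) -> float:
    """g(y) = 0 for y != 0 and g(0) = 1, so the limit of g at b = 0 is t = 0."""
    return 1.0 if y == 0 else 0.0


# Points x != 0 approaching a = 0.
xs = [10.0 ** (-k) for k in range(1, 8)]

# g evaluated near 0 but not at 0: every value is 0, consistent with t = 0.
g_values = [g(x) for x in xs]

# The composite only ever sees g(0): every value is 1, so its limit is 1, not t.
composite_values = [g(f(x)) for x in xs]
```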


4.2.15 Limits of Functions and Limits of Sequences 


We begin to explore the important role of sequences in considerations involving 
continuous functions. 


Proposition 4.7 Let $f : A \to \mathbb{R}$. Let $c$ be a limit point of $A$ and assume that
$\lim_{x \to c} f(x) = t$. If $(x_n)_{n=1}^{\infty}$ is a sequence in $A \setminus \{c\}$ such that $\lim_{n \to \infty} x_n = c$, then
$\lim_{n \to \infty} f(x_n) = t$.


Proof Let $\varepsilon > 0$. There exists $\delta > 0$, such that $|f(x) - t| < \varepsilon$ for all $x$ in $A$ that
satisfy $0 < |x - c| < \delta$. There exists a natural number $N$, such that $|x_n - c| < \delta$ for
all $n \ge N$. But $x_n \ne c$, so that for all $n \ge N$ we also have $|f(x_n) - t| < \varepsilon$.
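
As an illustration of Proposition 4.7, consider the following minimal Python sketch. The function $f(x) = (x^2 - 1)/(x - 1)$, the point $c = 1$ and the sequence $x_n = 1 + 1/n$ are our own choices, not taken from the text; exact rational arithmetic is used so that no rounding obscures the values.

```python
from fractions import Fraction


def f(x: Fraction) -> Fraction:
    """f(x) = (x**2 - 1) / (x - 1), defined on A = R without the point 1."""
    return (x * x - 1) / (x - 1)


# A sequence in A without c = 1 that converges to c.
ns = (1, 10, 100, 1000, 10000)
xs = [1 + Fraction(1, n) for n in ns]

# The values f(x_n) approach t = 2, the limit of f at c = 1.
values = [f(x) for x in xs]
errors = [abs(v - 2) for v in values]
```

Here $f$ is not even defined at $c$, which is why the sequence must lie in $A \setminus \{c\}$.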


It is interesting to ponder the question as to whether, given that $c$ is a limit point of
$A$, there must always exist a sequence $(x_n)_{n=1}^{\infty}$ in $A \setminus \{c\}$ such that $\lim_{n \to \infty} x_n = c$.
In fact such a sequence always exists. It is a consequence of an axiom of set theory:
the axiom of choice. This is a set-building axiom that is used to produce sequences 
in some cases when an assignment cannot be specified by any explicit procedure and 
requires an infinite number of arbitrary choices. In the cases which interest us, for 
example when A is an interval, or a finite union of intervals, it is not needed, as the 
existence of the sequence can be seen by an explicit procedure (though this may not 
be very obvious). For this reason we consider the axiom of choice to be beyond the 
scope of this text. 

Proposition 4.7 has an important and much used consequence. We allow the reader
to elucidate its proof.


Proposition 4.8 Let $f$ be continuous in its domain $A$, let $c \in A$ and let $(x_n)_{n=1}^{\infty}$ be
a sequence in $A$ such that $\lim_{n \to \infty} x_n = c$. Then $\lim_{n \to \infty} f(x_n) = f(c)$.


Note that in the proposition we do not have to assume that $x_n$ is never equal to $c$, nor
that $c$ is a limit point of $A$, though if it is not, then the proposition has little practical
value.


4.2.16 Iterations 


Suppose that the function $f$ is continuous in its domain $A$ and that a sequence $(x_n)_{n=1}^{\infty}$
in $A$ satisfies $x_{n+1} = f(x_n)$ for each $n$. Assume that the limit $\lim_{n \to \infty} x_n$ exists, is
equal to $t$ and that $t$ lies in $A$. Then by Proposition 4.8 we have $f(t) = t$.

Sequences are often defined in this fashion, called iterating the function f, but 
it is really just a simple instance of inductive or recursive definition. Iterations are 
commonly used to solve the equation f(x) = x. For the method to succeed, two 
things are needed which can be tricky to check: 


(a) The initial point $x_1$ must be chosen in such a way that each term of the sequence
$x_n$ is in $A$. We do not want $x_n$ to land outside $A$ for then we cannot continue
the sequence with $x_{n+1}$. In short we must choose $x_1$ so that the whole sequence
$(x_n)_{n=1}^{\infty}$ exists.

(b) The limit $\lim_{n \to \infty} x_n$ should exist and it should lie in $A$.


In the chapter on sequences we saw that a positive sequence can be defined
recursively by

$$a_0 = 1, \qquad a_{n+1} = \sqrt{2 + a_n},$$

and that $a_n$ converges to a limit $t$. Now we see, thanks to the continuity of the function
$\sqrt{x}$, that $t$ must satisfy $t = \sqrt{2 + t}$, so that in fact $t = 2$, as expected.
Another example of such an iteration is

$$x_1 = 1, \qquad x_{n+1} = 1 + \frac{1}{1 + x_n}.$$

The limit is $\sqrt{2}$, the unique positive root of $1 + 1/(1 + x) = x$.

Yet another, and quite important, example is furnished by Newton approximations.
These are widely used to approximate solutions of equations that cannot be obtained
in closed form. The scheme

$$x_1 = 1, \qquad x_{n+1} = \frac{1}{2}\left(x_n + \frac{2}{x_n}\right),$$

converging to $\sqrt{2}$ (sometimes called the Babylonian method for calculating $\sqrt{2}$ and
apparently known to the ancients) results from applying Newton's method to the
equation $x^2 - 2 = 0$. Newton approximations will be studied in Chap. 5.
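
The three iterations of this section are easy to try out. The following is a minimal sketch in Python; the helper `iterate` and the step counts are our own choices, and floating-point arithmetic only approximates the sequences.

```python
import math
from typing import Callable


def iterate(f: Callable[[float], float], start: float, steps: int) -> float:
    """Return the term reached after applying f to `start` a total of `steps` times."""
    x = start
    for _ in range(steps):
        x = f(x)
    return x


# a_{n+1} = sqrt(2 + a_n): the limit t satisfies t = sqrt(2 + t), that is t = 2.
limit_two = iterate(lambda a: math.sqrt(2 + a), 1.0, 60)

# x_{n+1} = 1 + 1/(1 + x_n): the limit is the positive root of 1 + 1/(1 + x) = x.
root_two_a = iterate(lambda x: 1 + 1 / (1 + x), 1.0, 60)

# x_{n+1} = (x_n + 2/x_n)/2: the Babylonian (Newton) scheme for sqrt(2).
root_two_b = iterate(lambda x: 0.5 * (x + 2 / x), 1.0, 8)
```

In all three cases every term stays in the domain of the iterated function, so condition (a) holds for the starting value 1. The Newton scheme needs far fewer steps than the other two, a point that the theory of Chap. 5 explains.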
4.2.17 Exercises (cont’d)


12. Write out a nice proof of the claim made earlier in this section that a monotonic
function with no gaps in its range is continuous.

13. Let $f$ and $g$ be continuous functions with the same domain $A$. Define the functions
$\max(f, g)$ and $\min(f, g)$, with domain $A$ by


$$\max(f, g)(x) = \max(f(x), g(x)), \qquad \min(f, g)(x) = \min(f(x), g(x)).$$


Show that $\max(f, g)$ and $\min(f, g)$ are continuous.

Hint. You can use the definition of continuity, or else you can express these
functions in terms of other functions and use limit-rules.

14. Let $f : \mathbb{R} \to \mathbb{R}$ be defined by $f(x) = \sin(1/x)$ if $x \ne 0$ and $f(0) = 0$. Show that
$f$ is discontinuous at $x = 0$, but the limits $\lim_{x \to 0+} \sin(1/x)$ and $\lim_{x \to 0-} \sin(1/x)$
do not exist.

Note. The function $\sin x$ will be familiar to the reader from school trigonometry. It will be
rigorously defined later but for now all we need to know is that $\sin x$ is continuous and periodic
(with period $2\pi$, but that is not important), its maximum is $1$ and its minimum $-1$.

15. Let $f : \mathbb{R} \to \mathbb{R}$ be defined by $f(x) = 1$ if $x$ is rational and by $f(x) = 0$ if $x$ is
irrational. Show that $f$ is everywhere discontinuous.

16. What can you say about the continuity or otherwise of the function $f$, given by
$f(x) = x$ for rational $x$ and $f(x) = 0$ for irrational $x$?

17. Show that the function $f$ defined by $f(x) = x \sin(1/x)$ for $x \ne 0$ and $f(0) = 0$
is everywhere continuous.

18. Let $f$ be the function with domain $]0, 1[$ defined by the following prescription:
if $x$ is irrational then $f(x) = 0$; if $x$ is rational then $f(x) = 1/b$, where $b$ is the
denominator of $x$ when it is expressed as a fraction in lowest terms. Show that
for all $c$ in the domain we have $\lim_{x \to c} f(x) = 0$. Deduce from this that $f$ is
continuous at each irrational $x$, but discontinuous at each rational $x$.

19. For each real number $x$ we denote by $[x]$ the highest integer $n$, such that $n \le x$.
Define the function $f : [0, \infty[ \to \mathbb{R}$ by


fO=0) fQ= 2, ESO). 
n=[1/x] 


(a) Show that $f$ is increasing.

(b) Show that $f$ is continuous at all points $x > 0$ except at those of the form
$x = 1/n$ for some $n \in \mathbb{N}_+$.

(c) Calculate the jump of the function at $x = 1/n$.

(d) Which of the following is true:
(i) $\lim_{x \to \frac{1}{n}-} f(x) = f\left(\frac{1}{n}\right)$?

(ii) $\lim_{x \to \frac{1}{n}+} f(x) = f\left(\frac{1}{n}\right)$?
(iii) Neither?

20. A function with domain all of $\mathbb{R}$ is called periodic if there exists a number $k \ne 0$,
such that $f(x + k) = f(x)$ for all $x$. A non-zero number $k$ with this property is
called a period of $f$.


(a) Let $k_1$ and $k_2$ be periods and let $m$ and $n$ be integers (not necessarily positive).
Show that $mk_1 + nk_2$ is either $0$ or else it is a period.

(b) Suppose there exists a lowest positive period $T$. Show that the set of all
periods is precisely the set of non-zero multiples $nT$ as $n$ ranges over the
non-zero integers.

(c) Suppose that $f$ is periodic but there is no lowest positive period. Show that the
set of periods has the following property: every open interval $]a, b[$, however
small $b - a$ is, contains a period.

(d) Suppose that $f$ is periodic, continuous, and that there exists no lowest positive
period. Show that $f$ is constant.

Note. A lowest positive period, if it exists, is called the fundamental period. Compare this
exercise with Sect. 2.5 Exercise 3.

21. Let $a_n = \sin n$ for $n = 1, 2, \ldots$. Show that for every $t$ in the interval $[-1, 1]$ there
exists a subsequence $(a_{k_n})_{n=1}^{\infty}$ that converges to $t$.

Hint. Use your knowledge of $\sin x$ from school mathematics. In particular it
oscillates between $-1$ and $1$ with period $2\pi$. In addition you will need two facts
of analysis, to be proved later: $\sin x$ is continuous; and $\pi$ is irrational. See also
Sect. 2.5 Exercise 3.

22. (◊) The notions of limit inferior and limit superior can be defined for functions.
Suppose that $f$ has domain $A$ and $c$ is a limit point of $A$. We define


$$\limsup_{x \to c} f(x) := \lim_{h \to 0+} \Bigl( \sup_{x \in A,\ 0 < |x - c| < h} f(x) \Bigr)$$

$$\liminf_{x \to c} f(x) := \lim_{h \to 0+} \Bigl( \inf_{x \in A,\ 0 < |x - c| < h} f(x) \Bigr).$$


Draw up a list of properties analogous to those stated for sequences in Sect. 3.11.
Look for opportunities to use them.

23. The notions of upper semi-continuity and lower semi-continuity were defined in
Exercise 6.


(a) Let $(f_k)_{k=1}^{\infty}$ be a sequence of functions and suppose that they are all upper
semi-continuous at $x_0$. Define the function $g$ by $g(x) = \inf_{1 \le k < \infty} f_k(x)$.
Show that $g$ is upper semi-continuous at $x_0$. The same holds if “upper semi-continuous”
is replaced by “lower semi-continuous” and “inf” by “sup”.

(b) Find an example of a sequence $(f_k)_{k=1}^{\infty}$ of continuous functions such that
the function $g(x) := \inf_{1 \le k < \infty} f_k(x)$ fails to be lower semi-continuous at at
least one point.


4.3 Properties of Continuous Functions


In this section we prove a number of general propositions about continuous functions 
using the full weight of the completeness axiom C1. In all cases it is the existence of 
supremum and infimum of bounded non-empty sets that is exploited. These propo- 
sitions flaunt the success of analysis and show that axiom C1 and the definition of 
continuity, which took so long to evolve to their present forms, are correct. 

It might be a good idea for the reader to reprise the contents of Sect. 2.4, particularly 
the paragraphs under the heading “Using supremum and infimum to prove theorems”, 
before reading on. 


4.3.1 The Intermediate Value Theorem 


If a continuous function is defined on an interval and among its values are the real
numbers $y_1$ and $y_2$, then among its values is every real number between $y_1$ and $y_2$.
Its range (the set of all its values) includes the interval with endpoints $y_1$ and $y_2$.


Proposition 4.9 Let $f : [a, b] \to \mathbb{R}$ be continuous and suppose that $\eta$ is a real
number between $f(a)$ and $f(b)$; in other words we suppose that $f(a) < \eta < f(b)$
or $f(b) < \eta < f(a)$. Then there exists $t \in ]a, b[$ such that $f(t) = \eta$.


Proof We preface the proof by pointing out how continuity is used in it. Let $a < t < b$
and assume that $f(t) < \eta$. We can deduce from this that there exists $\delta > 0$,
such that $f(x) < \eta$ for all $x$ in the interval $]t - \delta, t + \delta[$. For there exists $\varepsilon > 0$ such
that $f(t) + \varepsilon < \eta$. By the continuity of $f$ there exists $\delta > 0$, such that


$$f(t) - \varepsilon < f(x) < f(t) + \varepsilon$$


for all $x$ in the interval $]t - \delta, t + \delta[$. In particular for such $x$ we have $f(x) < \eta$.
In a similar way, if $f(t) > \eta$ there exists $\delta > 0$, such that $f(x) > \eta$ for all $x$ in the
interval $]t - \delta, t + \delta[$.

Similar considerations are valid for the endpoints. For example if $f(a) < \eta$ then
there exists $\delta > 0$, such that $f(x) < \eta$ for all $x$ in the interval $[a, a + \delta[$.

Let us prove the proposition on the assumption that $f(a) < \eta < f(b)$. Let $A$ be
the set of all $x$ in $[a, b]$, such that $f(s) < \eta$ for all $s$ in $[a, x]$. The set $A$ is bounded (it
is a subset of $[a, b]$) and is not empty (it contains $a$). Let $t = \sup A$ (the supremum
exists by Proposition 2.3). We shall show that $f(t) = \eta$.

We show first that $t$ lies in the open interval $]a, b[$. By the considerations of the first
and second paragraphs, since $f$ is continuous and $f(a) < \eta$ there exists $\delta_1 > 0$, such
that $f(x) < \eta$ for all $x$ in $[a, a + \delta_1]$. Hence $t \ge a + \delta_1$. Since $f$ is continuous and
$f(b) > \eta$ there exists $\delta_2 > 0$, such that $f(b - \delta_2) > \eta$. We deduce that $t \le b - \delta_2$.
For if $t$ was strictly above $b - \delta_2$, there would be an element of $A$ above $b - \delta_2$ and
we would have $f(b - \delta_2) < \eta$, contrary to the definition of $\delta_2$. From these arguments
we conclude that $a + \delta_1 \le t \le b - \delta_2$, that is, $t$ is in the open interval $]a, b[$.

We shall eliminate the possibilities $f(t) < \eta$ and $f(t) > \eta$, by showing that each
leads to a contradiction. This will prove that $f(t) = \eta$, as required.

Assume first that $f(t) < \eta$. Then there exists $\delta > 0$, such that $f(x) < \eta$ for all $x$
in the interval $]t - \delta, t + \delta[$. But there exists $s \in A$ such that $t - \delta < s \le t$ (because
$t = \sup A$) and therefore $f(x) < \eta$ for all $x$ in $[a, s]$. We deduce that $f(x) < \eta$ for all
$x$ in $[a, t + \delta[$ so that $A$ contains a number strictly higher than $t$. That is impossible
since $t$ is an upper bound of $A$.

Assume next that $f(t) > \eta$. Then there exists $\delta > 0$, such that $f(t - \delta) > \eta$. This
implies that $t - \delta$ is an upper bound of $A$, for otherwise $A$ would have an element
strictly above $t - \delta$ and we would have $f(t - \delta) < \eta$, contrary to the definition of $\delta$.
But it is impossible that $t - \delta$ is an upper bound of $A$, since $t$ is its lowest upper
bound.

Both assumptions, that $f(t) < \eta$ and that $f(t) > \eta$, have led to contradictions.
Finally we conclude that $f(t) = \eta$.

The case $f(a) > \eta > f(b)$ is handled in a similar fashion; or else the former
conclusion can be applied to the function $-f$.


4.3.2 Thoughts About the Proof of the Intermediate Value 
Theorem 


We cannot prove the intermediate value theorem without using axiom C1 (or something
equivalent to it like the existence of the least upper bound). The theorem is not
valid in $\mathbb{Q}$. The equation $x^2 = 2$ has no solution in $\mathbb{Q}$ although $1^2 < 2 < 2^2$.

The set $A$ used in the proof can be replaced by the set $B$ of all $x$ in $[a, b]$, such that
$f(x) \le \eta$. Then $t := \sup B$ is a solution of $f(x) = \eta$, by an argument very similar to
that used to prove Proposition 4.9. The difference is that this solution is the highest
one in the interval whilst the solution given in the proof is the lowest. Of course they
can be the same solution.

Another proof can be given using what is called the method of bisection. This is 
based on the fact that an increasing sequence that is bounded above is convergent, 
itself a consequence of axiom C1. We shall meet this method in the next proposition. 

The property of continuous functions encapsulated in the intermediate value theorem
was long considered a possible defining feature of continuity. We shall say that
a function defined on an interval $A$ has the intermediate value property if it satisfies
the following condition:


For all $a$ and $b$ in $A$ such that $a < b$, if $\eta$ is such that $f(a) < \eta < f(b)$ or
$f(a) > \eta > f(b)$, then there exists $x$, such that $a < x < b$ and $f(x) = \eta$.


The intermediate value property seems to assert that the graph of the function is
in one piece; it can be drawn without lifting the pencil from the paper. However,
we have to give up any idea of using this property as an alternative definition of
continuity, despite its appealing character. It turns out that there are discontinuous
functions that have the property. See the exercises in this section.

Finally, do not confuse the intermediate value theorem with the mean value the- 
orem of calculus. 


4.3.3 The Importance of the Intermediate Value Theorem 


The intermediate value theorem is very powerful. It gives convenient conditions for
concluding that an equation $f(x) = \eta$ has a solution. It does not say that the solution
is unique, and indeed, multiple solutions can exist. However, additional arguments
can imply uniqueness.

Solving an equation is a common problem of applied mathematics, perhaps even
the commonest, and so a theoretical proof that a solution exists is useful. Frequently
one has to resort to an approximation method that in a sequence of steps produces
more and more correct decimal digits of a solution. It is important to know that
there is a number there that is being approximated, and it is this that the theorem
guarantees.

As an example of the theorem in action, we can let $n$ be a positive integer greater
than or equal to 2, and deduce that every positive real number has a unique positive $n$th
root. For let $f : [0, \infty[ \to \mathbb{R}$ be the function $f(x) = x^n$, and let $c > 0$. A solution
of $x^n = c$ exists in the interval $]0, \infty[$ because firstly, $f$ is continuous; secondly,
$f(0) = 0$; and thirdly, as $\lim_{x \to \infty} x^n = \infty$ there must exist $b$ such that $f(b) > c$. The
solution is unique because $f$ is an increasing function, more precisely, if $0 \le x_1 < x_2$
then $x_1^n < x_2^n$.

This defines at a stroke the function $\sqrt[n]{y}$ that assigns to each positive $y$ its unique
positive $n$th root.
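
The existence argument just given also suggests a way of approximating the root. The sketch below is a minimal Python illustration and makes several choices of our own: the name `nth_root`, the doubling search for a number $b$ with $b^n > c$, the repeated halving of the interval, and the tolerance `tol`. Very large values of $c$ may cause floating-point overflow, which the sketch does not handle.

```python
def nth_root(c: float, n: int, tol: float = 1e-12) -> float:
    """Approximate the positive solution of x**n = c, for c > 0 and n >= 2."""
    if c <= 0 or n < 2:
        raise ValueError("need c > 0 and n >= 2")
    lo, hi = 0.0, 1.0
    # Find b with f(b) > c; such b exists because x**n tends to infinity.
    while hi ** n <= c:
        hi *= 2.0
    # Invariant: lo**n <= c < hi**n, so a solution lies between lo and hi.
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if mid ** n <= c:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For instance `nth_root(27.0, 3)` returns a number within about $10^{-11}$ of 3. The invariant in the inner loop uses only the values of $f$ at the two endpoints, exactly as in the hypotheses of Proposition 4.9.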


4.3.4 The Boundedness Theorem 


Continuous functions defined on bounded and closed intervals are bounded. This is 
the content of the boundedness theorem. 


Proposition 4.10 Let $f : [a, b] \to \mathbb{R}$ be continuous. Then $f$ is a bounded function.


Proof We assume that f is unbounded on [a, b] and derive a contradiction. The 
argument, based on the method of bisection, though intuitive, is rather long; so some 
patience may be required. For this reason we first give a rough description of the 
proof. 

If we divide the interval $[a, b]$ into two equal parts the function $f$ must be
unbounded on at least one of them. We can therefore choose one of the two intervals,
such that $f$ is unbounded on it. The length of the new interval is half that of the original
interval. Now we repeat this procedure, divide the new interval into two equal
parts and choose one part, such that $f$ is unbounded on it. Repeating this procedure
we obtain a sequence of intervals, with length that tends to 0, and such that $f$ is
unbounded on each. The intervals shrink down to a single point $t$. We obtain a contradiction
because $f$ is continuous at $t$ and is therefore bounded on an interval of the
form $]t - \delta, t + \delta[$.

In this rough description it may not be clear how the completeness axiom is needed; 
so we proceed to describe the proof more precisely. Recall that f is supposed to be 
unbounded on [a, b] and from this we wish to obtain a contradiction. 

Set $a_1 = a$ and $b_1 = b$. Let $c_1$ be the midpoint $\frac{1}{2}(a_1 + b_1)$. Then $f$ is unbounded
either on $[a_1, c_1]$ or on $[c_1, b_1]$ (or on both; that is not excluded here). Let $a_2 = a_1$,
$b_2 = c_1$ if $f$ is unbounded on $[a_1, c_1]$, and let $a_2 = c_1$, $b_2 = b_1$ if $f$ is bounded on
$[a_1, c_1]$ (because then $f$ is unbounded on $[c_1, b_1]$). In both cases $a_1 \le a_2 < b_2 \le b_1$,
the function $f$ is unbounded on the new interval $[a_2, b_2]$, and the length of $[a_2, b_2]$ is
half the length of $[a_1, b_1]$. We repeat this step and obtain an interval $[a_3, b_3]$ which
is either $[a_2, c_2]$ or $[c_2, b_2]$, where $c_2 = \frac{1}{2}(a_2 + b_2)$, and $f$ is unbounded on $[a_3, b_3]$.
Furthermore $a_1 \le a_2 \le a_3 < b_3 \le b_2 \le b_1$.

From this procedure there result two sequences in the interval $[a, b]$, an increasing
sequence $(a_n)_{n=1}^{\infty}$, and a decreasing sequence $(b_n)_{n=1}^{\infty}$. Moreover $a_n < b_n$, and
because $b_{n+1} - a_{n+1} = \frac{1}{2}(b_n - a_n)$ we have $b_n - a_n \to 0$. Finally $f$ is unbounded
on $[a_n, b_n]$ for each $n$. By Proposition 3.3 concerning bounded, monotonic sequences,
the proof of which depended on the completeness axiom, both sequences are convergent,
and since $b_n - a_n \to 0$ they have the same limit $t$, which lies in the interval
$[a, b]$ (the reader should check the last claim; see, for example Sect. 3.2 Exercise 9).

Now $f$ is continuous at $t$. So there exists $\delta > 0$, such that $|f(x) - f(t)| < 1$
for all $x$ in $[a, b]$ that satisfy $|x - t| < \delta$. For such $x$ we have $|f(x)| < |f(t)| + 1$,
so that $f$ is bounded on the set $[a, b] \cap ]t - \delta, t + \delta[$. But we know that $a_n$ and $b_n$
both converge to $t$. Hence there exists $N$, such that $a_n$ and $b_n$ both lie in the interval
$]t - \delta, t + \delta[$ for $n \ge N$; and this is the same as saying that the interval $]t - \delta, t + \delta[$
includes the interval $[a_n, b_n]$ for all $n \ge N$. But then $f$ is also bounded on $[a_n, b_n]$.
This is a contradiction since we chose $a_n$ and $b_n$ so that $f$ was unbounded on
$[a_n, b_n]$.


It is essential for the general validity of the boundedness theorem that the domain
is a bounded and closed interval. The boundedness theorem does not hold on intervals
of other kinds.


Exercise For each type of interval $A$, except the bounded closed interval, and
the empty interval, find an example of an unbounded, continuous function $f$ with
domain $A$.


4.3.5 Thoughts About the Proof of the Boundedness Theorem


Continuity was not used in its full strength in the proof of the boundedness theorem;
it only mattered that $f$ was locally bounded (which is a consequence of continuity).
Local boundedness is the property that for each $x_0$, there exists $\delta > 0$, such that $f$ is
bounded in the set $]x_0 - \delta, x_0 + \delta[ \,\cap\, [a, b]$.

It is also possible to prove the boundedness theorem using a method similar to
that used for the intermediate value theorem. Let $A$ be the set of all $x$ in $[a, b]$, such
that $f$ is bounded on $[a, x]$. Then setting $t = \sup A$, one proceeds to show that $t = b$
and $t \in A$.


Exercise Write out the details of the proof of the boundedness theorem suggested 
in the previous paragraph. 


By the same token it is possible to prove the intermediate value theorem by the
method of bisection. Just divide the interval into two equal parts and choose the
one for which $f(x) - \eta$ has a different sign at the endpoints and continue. This is a
practical method for approximating a solution. If the interval is $[0, 1]$ we can divide
into 10 equal parts, and repeating the process obtain a decimal expansion of one of
the solutions.
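
The division into 10 equal parts can be sketched as follows. This is a minimal Python illustration under assumptions of our own: the name `decimal_digits`, the normalisation $g = f - \eta$ with $g(0) \le 0 < g(1)$, and the use of exact rational arithmetic so that signs are computed without rounding.

```python
from fractions import Fraction
from typing import Callable, List


def decimal_digits(g: Callable[[Fraction], Fraction], count: int) -> List[int]:
    """Digits d1, d2, ... of a zero 0.d1d2... of g in [0, 1].

    Assumes that g is continuous and that g(0) <= 0 < g(1).
    """
    lo, step = Fraction(0), Fraction(1)
    digits = []
    for _ in range(count):
        step /= 10
        # Invariant: g(lo) <= 0 < g(lo + 10 * step).
        # Choose the first of the 10 parts whose right endpoint has g > 0;
        # g is then <= 0 at the left endpoint of that part.
        j = next(j for j in range(10) if g(lo + (j + 1) * step) > 0)
        digits.append(j)
        lo += j * step
    return digits


# Example: digits of the solution of x**2 = 1/2 in [0, 1].
digits = decimal_digits(lambda x: x * x - Fraction(1, 2), 8)
```

The chosen parts form a nested sequence of intervals whose lengths tend to 0, and the sign condition at the endpoints is preserved at each step, which is what the proof by bisection requires.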


Exercise Write out the details of the proof of the intermediate value theorem sug- 
gested in the previous paragraph. 


A rather short proof of the boundedness theorem can be based on Proposition
3.10, the Bolzano-Weierstrass theorem, which states that a bounded sequence has
a convergent subsequence. Here is a sketch of it, omitting some subtle set-theoretic
details. Suppose $f$ is unbounded. Then there must exist a sequence $(x_n)_{n=1}^{\infty}$ in $[a, b]$
such that the sequence $(f(x_n))_{n=1}^{\infty}$ is unbounded. By Bolzano-Weierstrass there is
a convergent subsequence $(x_{k_n})_{n=1}^{\infty}$ of $(x_n)_{n=1}^{\infty}$, say with limit $t$. Next it is shown
that $t$ is in the interval $[a, b]$ (important here that the interval is closed; see Sect. 3.2
Exercise 9), so that $f(x_{k_n}) \to f(t)$ by continuity, whilst at the same time $f(x_{k_n})$ is
unbounded, which is a contradiction.

The sketched proof just given is not just an academic curiosity. The Bolzano— 
Weierstrass theorem is capable of great generalisation, into the area of multivariate 
calculus, and even beyond, into the realm of infinite-dimensional spaces. It means 
that versions of the boundedness theorem, and the extreme value theorem of the next 
section, emerge repeatedly in advanced work. 


4.3.6 The Extreme Value Theorem 


A continuous function defined on a bounded and closed interval is not just bounded.
It attains a maximum and a minimum. This is the extreme value theorem.


Proposition 4.11 Let $f : [a, b] \to \mathbb{R}$ be continuous. Then there exist $c_1$ in $[a, b]$,
such that $f(x) \le f(c_1)$ for all $x$ in $[a, b]$; and $c_2$ in $[a, b]$, such that $f(x) \ge f(c_2)$
for all $x$ in $[a, b]$.


Proof We know that $f$ is bounded in $[a, b]$. Let $M = \sup_{a \le x \le b} f(x)$ (that is, $M$
is the supremum of the set of all values of $f$). Assume that $f$ does not attain a
maximum. Then $f(x) < M$ for all $x$ in $[a, b]$. The function $g(x) := 1/(M - f(x))$
is then continuous in $[a, b]$ (since the denominator is nowhere 0). However $g$ cannot
be bounded; however large we choose $K$ there exists $x \in [a, b]$ such that $f(x) >
M - 1/K$ (because $M$ is the supremum of $f$) and then $g(x) > K$. This contradicts
the boundedness theorem because $g$, being continuous in $[a, b]$, must be bounded.
We conclude that there exists $x$ in $[a, b]$ such that $f(x) = M$; that is, $f$ attains a
maximum value in $[a, b]$.

Similar arguments show that f attains a minimum value. 


We sketch a second proof of the extreme value theorem based on Bolzano-Weierstrass,
ignoring again some subtle set-theoretical details. We know that $f$
is bounded; so let $M = \sup f$. For each positive integer $n$ there exists $x_n$ in $[a, b]$,
such that $f(x_n) > M - 1/n$. The sequence $(x_n)_{n=1}^{\infty}$ has a convergent subsequence,
say, $(x_{k_n})_{n=1}^{\infty}$. Let its limit be $t$. Then $t \in [a, b]$ and by continuity of $f$ we have
$f(t) = \lim_{n \to \infty} f(x_{k_n}) = M$. We conclude that $f(t) = M$. A similar argument
shows that the infimum of $f$ is also attained.

The argument of the last paragraph can even be modified to prove the boundedness
theorem and the extreme value theorem simultaneously. We let $M = \sup f$ and
rewrite the last paragraph to allow the a priori possibility of $M = \infty$. The conclusion,
that $f(t) \ge M$, shows at once that $M$ is finite and is attained. All of this is capable
of much generalisation.


4.3.7 Using the Extreme Value Theorem 


Seeking the maximum or minimum of a function is a common problem of applied 
mathematics. Just as the intermediate value theorem can justify that what is sought, 
a solution of an equation, actually exists, so also the extreme value theorem can 
guarantee that what is sought, a maximum value or a minimum value, actually exists. 

The limitation of the extreme value theorem to a bounded closed interval can
sometimes be overcome. It is often possible to gain some knowledge of the maximum
or minimum of a continuous function $f(x)$ on an unbounded interval, if we can
control the function as $x \to \infty$.

As an example we suppose that $f$ is continuous on all of $\mathbb{R}$, and that $\lim_{x \to \infty} f(x)
= \lim_{x \to -\infty} f(x) = 0$. If now $f$ takes a positive value somewhere, then it must attain
a maximum. For suppose that $f(a) > 0$. We can find $K$, such that for $|x| > K$ we
have $|f(x)| < f(a)$. But then the maximum value of $f$ on the interval $[-K, K]$,
which is attained by the extreme value theorem, is the maximum value of $f$ on all
of $\mathbb{R}$. This argument is easily adapted to different cases.
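
To see the argument at work in a concrete case, here is a short worked example. The function $f(x) = x/(1 + x^2)$ and the choices $a = 1$ and $K = 2$ are our own illustrative assumptions and do not come from the text.

```latex
% Worked example (illustrative choice of f, a and K)
\begin{align*}
f(x) &= \frac{x}{1+x^2}, \qquad
  \lim_{x\to\infty} f(x) = \lim_{x\to-\infty} f(x) = 0, \qquad
  f(1) = \tfrac12 > 0. \\
|x| > 2 &\implies |f(x)| = \frac{|x|}{1+x^2} < \frac{1}{|x|} < \tfrac12 = f(1),
  \quad\text{so } K = 2 \text{ works}. \\
\max_{[-2,2]} f &\ \text{exists by the extreme value theorem and is at least } f(1),
  \text{ hence it equals } \max_{\mathbb{R}} f. \\
(1-x)^2 \ge 0 &\implies \frac{x}{1+x^2} \le \tfrac12,
  \quad\text{so the maximum is } \tfrac12, \text{ attained at } x = 1.
\end{align*}
```

The last line identifies the maximum by an elementary inequality; the extreme value theorem is needed only to know in advance that a maximum exists.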
4.3.8 Exercises


1. Let $f(x) = x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0$ be a polynomial function with
leading coefficient 1, and with odd degree $n$.


(a) Show that $\lim_{x \to \infty} f(x) = \infty$ and $\lim_{x \to -\infty} f(x) = -\infty$.
(b) Show that the equation $f(x) = y$ has at least one real root for every $y$.


2. Let $f(x) = x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0$ be a polynomial function with
leading coefficient 1, and with even degree $n$.


(a) Show that $\lim_{x \to \infty} f(x) = \lim_{x \to -\infty} f(x) = \infty$.

(b) Show that the function $f(x)$ attains a minimum value $m$ at some point.

(c) Show that for $y > m$ the equation $f(x) = y$ has at least two solutions in $\mathbb{R}$,
whereas for $y < m$ it has no solution.


3. Prove the following fact, used several times in this section: if A is a closed 
interval, then no sequence in A can converge to a point outside A. 
Note. The interval does not have to be bounded. This is really the same as Sect. 3.2 Exercise 9. 
4, Show that the equation x° — x? + 1 = 0 has a root in the interval —1 < x <0. 
5. Let $f : [a, b] \to [a, b]$ be continuous. Show that there exists $x$ in $[a, b]$ such
that $f(x) = x$.

6. Let $f$ be a continuous function defined in an interval $A$ (or, more generally, $f$
is a function that satisfies the intermediate value property). Let $t_1, t_2, \ldots, t_m$ be
points in $A$, and let $c_1, c_2, \ldots, c_m$ be positive numbers. Set


$$w = \frac{c_1 f(t_1) + c_2 f(t_2) + \cdots + c_m f(t_m)}{c_1 + c_2 + \cdots + c_m}.$$


Show that there exists $\xi$ in $A$ such that $f(\xi) = w$.
7. Consider the function $f$ with domain $\mathbb{R}$ given by


$$f(x) = \sin\left(\frac{1}{x}\right), \quad (x > 0), \qquad f(x) = 0, \quad (x \le 0).$$


Show that $f$ is discontinuous but has the intermediate value property.
Hint. You will need to know that $\sin x$ is continuous, periodic (the period is $2\pi$
but that is not needed) and that $\sin x$ oscillates between its maximum $1$ and its
minimum $-1$. These facts will be established properly in a later chapter.

8. Let $A$ be an interval and let $f : A \to \mathbb{R}$ be continuous.


(a) Show that the range of f (the set of all its values) is an interval. 
Hint. See Sect. 2.4 Exercise 6. 

(b) Suppose that $A$ is closed and bounded. Show that the range of $f$ is a closed
and bounded interval.

9. Let $f$ be continuous and defined in an interval $A$. A line segment joining two
points $(a, f(a))$ and $(b, f(b))$ in the graph $y = f(x)$, where $a$ and $b$ are distinct
points in $A$, is called a chord. As is usual in analytic geometry the slope of the
chord is the number


$$\frac{f(a) - f(b)}{a - b}.$$


Let $B$ be the set of all numbers $p$, such that there exists a chord with slope $p$.
Show that $B$ is an interval.

Hint. Given two chords, the first whose endpoints have x-coordinates $a_0$ and
$b_0$, and the second whose endpoints have x-coordinates $a_1$ and $b_1$, consider the
variable chord whose endpoints have x-coordinates $a_t$ and $b_t$, where


$$a_t = (1 - t)a_0 + t a_1, \qquad b_t = (1 - t)b_0 + t b_1, \qquad 0 \le t \le 1.$$


10. Let $A$ be an interval and $f : A \to \mathbb{R}$. Show that $f$ has the intermediate value
property if and only if, for every interval $B$ such that $B \subset A$, the set $f(B)$ (the
set of all $y$ such that $y = f(x)$ for some $x \in B$) is an interval. In short, $f$ has
the intermediate value property if and only if it maps intervals to intervals.

11. (◊) The notions of upper and lower semi-continuity were defined in Sect. 4.2
Exercise 6. Let $f : [a, b] \to \mathbb{R}$ be upper semi-continuous (that is, it is upper
semi-continuous at all points of its domain).


(a) Let $a \le c \le b$ and let $(x_n)_{n=1}^{\infty}$ be a sequence in $[a, b]$ such that $x_n \to c$.
Show that

$$\limsup_{n \to \infty} f(x_n) \le f(c).$$

Note that the limit superior may have infinite absolute value.

(b) Show that $f$ attains a maximum value in $[a, b]$.

Hint. Let $M = \sup_{[a, b]} f$ (allowing $\infty$ as a possible value). Let $(x_n)_{n=1}^{\infty}$ be a
sequence in $[a, b]$, such that $f(x_n) \to M$. Revisit the last paragraph of the
section “The extreme value theorem”.

(c) Obtain similar results in the case that $f$ is lower semi-continuous, replacing
limit superior with limit inferior, reversing the inequality sign, and concluding
that $f$ attains a minimum.


4.4 Inverses of Monotonic Functions 


Let us run over some important concepts about functions. They all contribute to 
understanding the set of solutions of an equation f(x) = y. 

Let f : A → B where A and B are quite arbitrary sets. One may think of f purely
as an assignment of an element in B to each element in A. The concepts we shall
define are set-theoretical in nature, although we shall apply them in this text almost
entirely to functions of a real variable.


(a) The function f is said to be injective if it maps distinct elements in A to distinct 
elements in B. This is equivalent to saying that if x₁ and x₂ are elements of A,
and f(x₁) = f(x₂), then x₁ = x₂. Equivalently, if y is an element of B and the
equation f(x) = y has a solution, then the solution is unique. 

(b) The function f is said to be surjective, if the equation f(x) = y has a solution 
for all y in B. We also say that f maps A on to B, since every element of B 
appears as a value f(x) for some x in A. 

(c) The function f is said to be bijective if it is both injective and surjective. This is 
equivalent to saying that the equation f(x) = y has a unique solution for every 
y in B.

(d) Even if f is not surjective we can make it so by restricting the codomain to those 
elements y such that the equation f(x) = y has a solution. These form a subset 
of B called the range of f (already mentioned in Sect. 4.3). A function always 
maps its domain on to its range. 


Given that the function f : A → B is bijective we can define the inverse function
f⁻¹ : B → A, also bijective. For each y ∈ B we simply let f⁻¹(y) be the unique
x in A such that f(x) = y. The important cases in this text are when A and B are 
sets of real numbers. The key observation is that a strictly monotonic function is 
injective. 


Proposition 4.12 Let f : ]a, b[ → R be continuous and strictly increasing, allowing
here the possibilities a = −∞ and b = ∞. Let

\[ c = \lim_{x\to a+} f(x) \quad\text{and}\quad d = \lim_{x\to b-} f(x) \]

(again allowing c = −∞ or d = ∞ so that the limits always exist). Then the range
of f is the interval ]c, d[ and f is a bijective function from ]a, b[ to ]c, d[. The
inverse function

\[ f^{-1} : \,]c, d[\, \to \,]a, b[ \]

is strictly increasing and continuous.


Proof Let y ∈ ]c, d[. Then the equation f(x) = y has a solution by the intermediate
value theorem. To see this let y₁ and y₂ satisfy

\[ c < y_1 < y < y_2 < d. \]

We have

\[ c = \lim_{x\to a+} f(x) \quad\text{and}\quad d = \lim_{x\to b-} f(x). \]

Therefore we can find x₁ and x₂, such that

\[ a < x_1 < x_2 < b, \qquad f(x_1) < y_1, \qquad f(x_2) > y_2. \]

So the equation f(x) = y has a solution between x₁ and x₂. The solution is unique
because f is strictly increasing. Note how the phrasing of this paragraph works just
as well for infinite values of a, b, c or d, as for finite ones.

The inverse function

\[ f^{-1} : \,]c, d[\, \to \,]a, b[ \]


is therefore defined and is bijective. It is easy to see that f⁻¹ is strictly increasing.
The proof that it is continuous is really the same as the one that we used to prove that
√x is continuous. Let c < y₀ < d and x₀ = f⁻¹(y₀). Then y₀ = f(x₀). We shall
prove that f⁻¹ is continuous at y₀.

Let ε > 0. Let y₁ = f(x₀ − ε) and y₂ = f(x₀ + ε) (reduce ε if necessary so that
x₀ − ε and x₀ + ε fall within the interval ]a, b[ ). Now we set

\[ \delta = \min(y_2 - y_0,\; y_0 - y_1). \]

Because of monotonicity, if

\[ y_0 - \delta < y < y_0 + \delta \]

then

\[ x_0 - \varepsilon < f^{-1}(y) < x_0 + \varepsilon. \]

Another proof, perhaps simpler, that f⁻¹ is continuous, is based on the observation
that f⁻¹ is monotonic and its range, the interval ]a, b[, is without gaps. Therefore
f⁻¹ has no jumps, and must be continuous as the discontinuities of a monotonic
function consist only of jumps.
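
The existence part of the proof can also be read as a recipe for computing values of the inverse. The following is only a sketch and not part of the original text: it assumes that f is continuous and strictly increasing on a closed interval [lo, hi] with f(lo) < y < f(hi), and it approximates the unique solution of f(x) = y by repeated halving. The function name inverse_by_bisection, the tolerance and the two test functions are illustrative choices.

```python
import math

def inverse_by_bisection(f, y, lo, hi, tol=1e-12):
    """Approximate the solution x of f(x) = y for f continuous and strictly
    increasing on [lo, hi].

    Assumes f(lo) < y < f(hi), which mirrors the choice of x1 and x2 in the proof.
    """
    if not (f(lo) < y < f(hi)):
        raise ValueError("y must lie strictly between f(lo) and f(hi)")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < y:
            lo = mid   # the solution lies to the right of mid
        else:
            hi = mid   # the solution is mid or lies to its left
    return 0.5 * (lo + hi)

# Inverse of x**3 at y = 10 (a cube root), searching in [0, 3].
cube_root_of_10 = inverse_by_bisection(lambda x: x ** 3, 10.0, 0.0, 3.0)

# Inverse of sin at y = 0.5, on [-1.5, 1.5], which lies inside ]-pi/2, pi/2[.
arcsin_half = inverse_by_bisection(math.sin, 0.5, -1.5, 1.5)
```

Monotonicity is what makes the halving step safe: comparing f(mid) with y tells us on which side of mid the solution lies.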

Proposition 4.12 provides us with the most commonly used tool to produce inverse 
functions in the calculus of functions of one variable. Here are some examples: 


(a) We consider the function f(x) = xⁿ on the domain ]0, ∞[ (given that n is a fixed
positive integer). Then f is continuous, strictly increasing and, as is easily seen,
maps ]0, ∞[ on to ]0, ∞[. The inverse function, f⁻¹(y) = y^(1/n), is continuous
and maps ]0, ∞[ on to itself. It is also written as ⁿ√y. This short paragraph could
replace all our previous lengthy deliberations about the nth root function.

(b) The function f(x) = sin x is not monotonic but we can restrict it to an interval
where it is strictly increasing, for example ]−π/2, π/2[. It maps this interval
on to the interval ]−1, 1[; so we get an inverse function arcsin y, that maps the
interval ]−1, 1[ on to the interval ]−π/2, π/2[, and is also increasing.

(c) A result similar to Proposition 4.12 holds for strictly decreasing functions; the 
reader is invited to formulate it. It can be used to produce an inverse function 
arccos y for cos x, using the interval ]0, π[ on which cosine is decreasing.


4.4.1 Exercises


1. Let α be a rational number. Prove that the function

\[ f(x) = x^{\alpha}, \qquad (0 < x < \infty) \]

is continuous.

2. Let A and B be sets and let f : A → B be bijective. Show that

\[ f \circ f^{-1} = \mathrm{id}_B, \qquad f^{-1} \circ f = \mathrm{id}_A, \]

where id_A : A → A and id_B : B → B are the identity functions: id_A(x) = x
and id_B(y) = y.

3. For each of the following functions, describe inverse functions, restricting the 
domain if necessary: 


1 
cae. (et) 


(b) f(x) = x³ + x, (x ∈ R).
You may not be able to give a formula here, but describe the domain and 
range of an inverse. 

(c) f(x) = x⁴ + x² + 1, (x ∈ R).


4. Let A be an interval, let f be a function with domain A and suppose that f is 
continuous and injective. Prove that f is monotonic. 
Hint. Use the intermediate value theorem. It may be simpler to consider first the 
case A = [a, b].


4.5 Two Important Technical Propositions 


The contents of this section, Cauchy’s principle for the limit of a function, and the 
small oscillation theorem, may be read later when they are required. We also meet 
the notion of uniform continuity. 


Proposition 4.13 (Cauchy’s principle) Let f : A → R and let c be a limit point
of A. Then the limit lim_{x→c} f(x) exists and is a finite number if and only if the
following condition (Cauchy’s condition) holds: for each ε > 0 there exists δ > 0,
such that |f(x₁) − f(x₂)| < ε for all x₁ and x₂ in A that satisfy 0 < |x₁ − c| < δ
and 0 < |x₂ − c| < δ.


Proof That Cauchy’s condition is necessary for the existence of a finite limit follows 
by virtually the same argument as was used to prove Cauchy’s principle for sequences, 
Proposition 3.12. 

Let us prove that the condition is sufficient. Assume that Cauchy’s condition is
satisfied. Let (a_n)_{n=1}^∞ be a sequence in A that converges to c and is such that no term a_n
is equal to c.¹ The sequence (f(a_n))_{n=1}^∞ satisfies Cauchy’s condition for sequences,
and so the limit lim_{n→∞} f(a_n) = t exists and is finite. If (b_n)_{n=1}^∞ is another sequence
in A \ {c} with limit c then the limit lim_{n→∞} f(b_n) = s exists by the same token.
But now we must have s = t. For we can construct a third sequence with limit c
whose terms, (d_n)_{n=1}^∞, are taken from (a_n)_{n=1}^∞ and (b_n)_{n=1}^∞ alternately. Then the limit
lim_{n→∞} f(d_n) exists, but this is only possible if s = t.

We see therefore that lim_{n→∞} f(a_n) exists, and is finite, for every sequence (a_n) in
A \ {c} that converges to c, and the limit t is the same for every such sequence.

Now we prove that lim_{x→c} f(x) = t. Let ε > 0. There exists δ > 0, such that
|f(x₁) − f(x₂)| < ε/2 for all x₁ and x₂ in A that satisfy 0 < |x₁ − c| < δ and
0 < |x₂ − c| < δ. There exists y in A, such that 0 < |y − c| < δ and |f(y) − t| < ε/2
(simply choose a sequence in A \ {c} with limit c, and choose for y a term in the
sequence sufficiently near to c). If now x is in A and 0 < |x − c| < δ, we obtain

\[ |f(x) - t| \le |f(x) - f(y)| + |f(y) - t| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon, \]

as needed.


Let us display Cauchy’s condition for limits of functions so that the reader can 
better compare it to Cauchy’s condition for limits of sequences, given in Sect. 3.7: 


For each ε > 0 there exists δ > 0, such that |f(x₁) − f(x₂)| < ε for all x₁
and x₂ in A that satisfy 0 < |x₁ − c| < δ and 0 < |x₂ − c| < δ.


There is a version of Cauchy’s principle for the limit lim_{x→∞} f(x), and with
obvious changes, for lim_{x→−∞} f(x). It is very useful for studying improper integrals
(Chap. 12). In the following, it is reasonable to assume that f is defined in an interval
of the form ]a, ∞[.


The limit lim_{x→∞} f(x) exists and is finite if and only if the following condition
is satisfied: for all ε > 0 there exists K, such that |f(x) − f(y)| < ε for all
x and y that satisfy x > K and y > K.


Exercise Prove the last assertion. 


4.5.1 The Oscillation of a Function 


Let f : [a, b] → R be a bounded function. The difference sup_{[a,b]} f − inf_{[a,b]} f is
called the oscillation of f on the interval [a, b]. Recall that sup_{[a,b]} f is the supremum
and inf_{[a,b]} f the infimum of the set { f(x) : x ∈ [a, b] }.


¹ It is obvious that such sequences exist if A is an interval, and this is the case in all applications
considered in this text. For general sets we must appeal to the so-called axiom of choice of set
theory.

More generally we define the oscillation of f on a subset A of its domain as the
difference:

\[ \Omega_A f := \sup_A f - \inf_A f. \]

It is easy to see that f is continuous at the point c if and only if for each ε > 0
there exists δ > 0, such that the oscillation of f on the set ]c − δ, c + δ[ ∩ A is less
than ε. The proof is left to the exercises, but we shall use this fact in the proof of the
small oscillation theorem.

The following proposition will be needed to prove that continuous functions are 
integrable. 


Proposition 4.14 (The small oscillation theorem) Let f : [a, b] → R be continuous.
Let ε > 0. Then it is possible to partition the interval [a, b] with finitely many
points

\[ a = t_0 < t_1 < t_2 < \cdots < t_m = b \]

such that the oscillation of f on each interval [t_j, t_{j+1}], j = 0, 1, 2, …, m − 1, is less
than ε.


Proof Let ε > 0. Partition the interval into two parts [a, c] and [c, b] using the
midpoint c = ½(a + b). If, for the given ε, the conclusion of the proposition is true
for both the intervals [a, c] and [c, b], then it is true for [a, b].

Turn this around and use the method of bisection. We suppose, for the given ε,
that the conclusion of the proposition is not true for the interval [a, b]. Then either it
is not true for [a, c] or it is not true for [c, b], and we choose that interval for which
it is not true (we can choose the left interval if it fails for both). We repeat this for
the new interval, and so on.

We can therefore construct an increasing sequence (a_n)_{n=1}^∞ and a decreasing
sequence (b_n)_{n=1}^∞, such that a₁ = a, b₁ = b, a_n < b_n, b_n − a_n tends to 0, and for
each n there is no partition of [a_n, b_n] into finitely many intervals on each of which
the oscillation of f is less than ε.

Now (a_n)_{n=1}^∞ and (b_n)_{n=1}^∞ converge to the same limit t in [a, b]. Suppose first that
a < t < b. Since f is continuous at t there exists an interval [t − h, t + h] on which
the oscillation of f is less than ε. But when n is sufficiently large we have t − h <
a_n < b_n < t + h. The interval [t − h, t + h] then includes the interval [a_n, b_n], so
that the oscillation of f on [a_n, b_n] must be less than ε also. The conclusion of the
proposition holds for the interval [a_n, b_n] and the given ε, without even partitioning
it. This contradicts the definition of the sequences a_n and b_n.

If t = a we use the interval [a, a + h] instead of [a − h, a + h], and if t = b the
interval [b − h, b], in the argument of the last paragraph.


4.5.2 Uniform Continuity


There is another form of the small oscillation theorem that is useful. It concerns the 
notion of uniform continuity. Suppose the function f is continuous in its domain A. 
This means it is continuous at each point a in A. Suppose we consider its continuity 
at a. For each ε > 0 there exists δ > 0, such that for all x in A that satisfy |x − a| < δ
we have |f(x) − f(a)| < ε.

If we move to a different point b, instead of a, but keep the same ε, it is not certain
that the same δ will work. But if it does not work we can make it work by decreasing
it, since we know that some δ will work for b (and the same ε as at a). We can clearly
look simultaneously at a finite number of points a₁, a₂, …, a_m, and find one δ that
works for them all (always for the same ε as before). We just take the lowest of the
m values of δ that we found for the points individually. But to treat infinitely many
points, for example all points of the domain A, and find one δ that works for them
all, may be impossible. Tolkien’s One Ring can rule the others only so long as the
others are finitely many.

The upshot of the discussion of the last paragraph is to define the notion of uniform
continuity. We say informally that a function is uniformly continuous on its domain
A, if it is continuous in A, and for each ε > 0 a single δ can be found that works for
all points of A.

This can be expressed in a different way, which is nicer, and more symmetrical.
We have to make |f(x) − f(a)| small only in virtue of |x − a| being small, and
without pinning down a in advance. We therefore say that a function f is uniformly
continuous on the domain A if it satisfies the following condition:


For all ε > 0 there exists δ > 0, such that for all x and y in A that satisfy
|x − y| < δ we have |f(x) − f(y)| < ε.


Proposition 4.15 Let f : [a, b] → R be continuous. Then f is uniformly continuous.


Proof Let ε > 0. There exists a partition of [a, b], such that the oscillation of f is
less than ε/2 on each subinterval. Let δ be smaller than the length of the shortest
subinterval of the partition. If |x − y| < δ then either x and y belong to the same
subinterval, so that |f(x) − f(y)| < ε/2; or else x and y belong to adjacent
subintervals, in which case, comparing both values with the value of f at the common
endpoint, we find |f(x) − f(y)| < ε.
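
The two proofs above are constructive in spirit, and the following sketch (not part of the original text) imitates them numerically. It bisects an interval until the oscillation on each piece is below ε/2 and then takes δ to be half the shortest piece. The important assumption is that the oscillation is only estimated from finitely many sample points, which is exact for a monotonic function such as the square root used here but is merely a heuristic in general. The names sampled_oscillation and small_oscillation_partition, and the choice ε = 0.1, are illustrative.

```python
import math

def sampled_oscillation(f, lo, hi, samples=200):
    """Estimate sup f - inf f on [lo, hi] from equally spaced sample points."""
    values = [f(lo + (hi - lo) * k / samples) for k in range(samples + 1)]
    return max(values) - min(values)

def small_oscillation_partition(f, a, b, eps):
    """Bisect [a, b] until the sampled oscillation on every piece is below eps.

    Returns the partition points a = t_0 < t_1 < ... < t_m = b.
    """
    if sampled_oscillation(f, a, b) < eps:
        return [a, b]
    mid = 0.5 * (a + b)
    left = small_oscillation_partition(f, a, mid, eps)
    right = small_oscillation_partition(f, mid, b, eps)
    return left + right[1:]   # drop the duplicated midpoint

eps = 0.1
# Partition [0, 1] so that sqrt oscillates by less than eps / 2 on each piece.
points = small_oscillation_partition(math.sqrt, 0.0, 1.0, eps / 2)
# As in the proof of Proposition 4.15: delta is smaller than the shortest piece.
delta = 0.5 * min(t1 - t0 for t0, t1 in zip(points, points[1:]))
```

With this δ, any two points of [0, 1] closer than δ lie in the same piece or in adjacent pieces, so their square roots differ by less than ε. The pieces near 0 come out much shorter than those near 1, which reflects the steepness of the square root at 0.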


4.5.3 Exercises 


1. Let f be a function on the domain ]a, b[. We suppose that b is finite. Suppose
that there exists a constant K, such that for all x and y in the domain we have
|f(x) − f(y)| ≤ K|x − y|. Show that lim_{t→b−} f(t) exists and is a finite
number.

Note. The condition on f is called a Lipschitz condition. It is a strong version of continuity.

2. Prove that if a function f satisfies a Lipschitz condition (see Exercise 1) on a
domain A then it is uniformly continuous on A.

3. It is important for the validity of Propositions 4.14 and 4.15 that the domain of
f is a closed and bounded interval. For each type of interval A, that is not closed
and bounded, find an example of a continuous function with domain A that is
not uniformly continuous.

4. Prove that Ω_A f = sup_{x,y∈A} |f(x) − f(y)|, where the supremum is taken over
all pairs of points x and y in A. This characterisation of oscillation is often useful.


In the remaining exercises it is simplest and most useful to assume that the domain
A is an interval.


5. Prove that a function f is continuous at a point c in A if and only if the
following condition is satisfied: for all ε > 0 there exists δ > 0, such that

\[ \Omega_{]c-\delta,\,c+\delta[\,\cap A}\, f < \varepsilon. \]

6. Given that f is a bounded function, show that for each c in A the limit

\[ \omega_f(c) = \lim_{h\to 0+} \Omega_{]c-h,\,c+h[\,\cap A}\, f \]

exists and is a finite number. This enables us to define the function ω_f, called
the point oscillation of f.


The point oscillation ω_f of a function f is studied in the next three exercises. Assume
that the domain of f is an interval.


7. Show that f is continuous at c if and only if ω_f(c) = 0.

8. Show that if the left limit lim_{x→c−} f(x) exists and is a finite number then
lim_{x→c−} ω_f(x) = 0. Show that the converse is false. (Similar results hold for
the right limit.)

9. Give an example to show that the function ω_f is not necessarily continuous.
However, it is upper semi-continuous. This means that for each point c, and for
each ε > 0, there exists δ > 0, such that ω_f(x) < ω_f(c) + ε for all x that satisfy
|x − c| < δ. Prove this.


4.6 (©) Iterations of Monotonic Functions 


We present some simple, practical and general conclusions about iterations. The 
main object is to exploit continuity and monotonicity to compute a solution of a 
fixed point problem f(x) = x, given that we know that the solution exists. Further 
developments (such as the study of convergence rates) require derivatives and will 
be taken up later. 


Let f : A → R be a continuous function, where A is a subset of R. We assume
that we can define an infinite sequence in A by the iteration scheme a_{n+1} = f(a_n),
using some initial point a₁. We know that if the sequence converges to t, and if t is
in A, then t is a solution to f(x) = x. In practice an iteration scheme such as this is
used to approximate a solution to f(x) = x.

Fig. 4.6 Picture of the proof. Iterating an increasing function; the case a₁ < a₂

The problem is to see whether the sequence (a_n)_{n=1}^∞ is convergent. Even if the
equation f(x) = x is known to have exactly one solution in A, it is not guaranteed
that the iterations converge. Here are two conclusions that are sometimes useful.
They are capable of some variation, which increases their practicality, but it is left
to the reader to explore this. Both depend on monotonicity and continuity of the
function iterated.


(i) Let f : ]0, ∞[ → R, where f(x) > 0 for all x in its domain and f is continuous
and increasing. Suppose it is known that the equation f(x) = x has exactly one
solution t. Then the following conclusions hold. If a₁ < t and a₁ < a₂, then a_n
is increasing and converges to t; if a₁ > t and a₁ > a₂, then a_n is decreasing and
converges to t.

(ii) Let f : ]0, ∞[ → R, where f(x) > 0 for all x in its domain and f is continuous
and decreasing. Assume that the equation f(x) = x has exactly one solution
t and that t is also the unique solution of f(f(x)) = x. Then the following
conclusions hold. If a₁ < t and a₁ < a₃, then a_{2n−1} (the subsequence with odd
place numbers) is increasing and converges to t whilst a_{2n} (the subsequence with
even place numbers) is decreasing and converges to t.


Proof of the First Rule In the case a₁ < t and a₁ < a₂ it is seen by induction (left to
the reader to verify) that a_n < a_{n+1} for all n, and a_n < t for all n. The sequence a_n
is therefore increasing and bounded above; it therefore converges, to s say. But now
f(s) = s by the continuity of f, and since there is only one solution we must have
s = t. The case a₁ > t and a₁ > a₂ is similar.


Proof of the Second Rule The second rule follows from the first. Let g = f ∘ f.
Assume that a₁ < t. Now g is increasing and t is the sole solution of g(x) = x. The
sequence b_n = a_{2n−1} satisfies b_{n+1} = g(b_n). We apply the first rule to g. If b₁ < b₂
(that is a₁ < a₃) then b_n = a_{2n−1} is increasing and tends to t. The sequence c_n =
a_{2n} also satisfies c_{n+1} = g(c_n). We see that c₁ = a₂ > t since f is decreasing and
f(t) = t. Moreover c₂ = a₄ = f(a₃) < f(a₁) = a₂ = c₁. The sequence c_n = a_{2n} is
therefore decreasing and tends to the limit t.

Fig. 4.7 Picture of the proof. Iterating a decreasing function; the case a₁ < a₃


The proofs of these rules are pictured in Figs. 4.6 and 4.7. 
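
Both rules are easy to observe numerically. The sketch below is not part of the original text, and the two functions in it are illustrative choices made for this example: √(6 + x) is increasing on ]0, ∞[ with the single fixed point t = 3, while 1 + 2/x is decreasing on ]0, ∞[ with the single fixed point t = 2, which is also the only positive solution of f(f(x)) = x. The helper name iterate and the starting value 1 are likewise assumptions of the example.

```python
import math

def iterate(f, a1, n):
    """Return the first n terms a_1, ..., a_n of the scheme a_{k+1} = f(a_k)."""
    terms = [a1]
    for _ in range(n - 1):
        terms.append(f(terms[-1]))
    return terms

def increasing_example(x):
    return math.sqrt(6.0 + x)   # increasing; fixed point t = 3

def decreasing_example(x):
    return 1.0 + 2.0 / x        # decreasing; fixed point t = 2

# Rule (i): a_1 = 1 < t and a_1 < a_2, so the terms increase towards 3.
inc_terms = iterate(increasing_example, 1.0, 40)

# Rule (ii): a_1 = 1 < t and a_1 < a_3, so the odd-numbered terms increase
# and the even-numbered terms decrease, both towards 2.
dec_terms = iterate(decreasing_example, 1.0, 80)
odd_terms = dec_terms[0::2]    # a_1, a_3, a_5, ...
even_terms = dec_terms[1::2]   # a_2, a_4, a_6, ...
```

Printing the first few entries of odd_terms and even_terms shows the two subsequences closing in on the fixed point from opposite sides, as in Fig. 4.7.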


4.6.1 Exercises 


1. Draw conclusions about the results of iterating the following functions, using a 
positive starting value. 


(a) f(x) = √(2 + x)
(b) f(x) = √(2x)


2 
©) fa)= 


(connected with the ratio of successive Fibonacci numbers). 
1 
e = 1+ — 
(e) f(x) ea 


x 
(connected with the continued fraction of the limit; see the nugget “Contin- 
ued fractions’’). 


4.6.2 Pointers to Further Study 


— Numerical analysis 
— Dynamical systems 


Chapter 5
Derivatives and Differentiation


Big fleas have little fleas upon their backs to bite ’em, And little 
fleas have lesser fleas, and so, ad infinitum. 


Augustus de Morgan 


There’s a problem with continuity. Suppose that f is continuous at a point x₀. Suppose
we want to compute f(x₀) with an error less than ε, for example ε = 10⁻⁵, but we
do not know x₀ exactly. We know that there exists δ, such that if |x − x₀| < δ then
|f(x) − f(x₀)| < 10⁻⁵. We do not therefore have to know x₀ exactly; a certain
number of decimal places will suffice.

But what if δ needed to be uncomfortably small compared to ε in order to achieve
the desired accuracy? What, for example, if δ was 10⁻¹⁰, or 10⁻¹⁰⁰ or even less....?
The function may be continuous but continuity does not seem so useful here.

The problem is that f could be increasing or decreasing very rapidly at the
point x₀. But what does that mean: the rate of increase or decrease of a function at
a point?

The concept of the rate of growth of a function at a point is the key to the calculus 
of Newton and Leibniz and is what we call the derivative. As soon as it was introduced 
it became possible to solve important problems in geometry and physics with the 
new calculus, in spite of the fact that an acceptable definition of derivative was not 
given for some 200 years. 


5.1 The Definition of Derivative 


The average rate of growth of a function f between distinct points x₀ and x is the
quotient

\[ \frac{f(x) - f(x_0)}{x - x_0}. \]

The rate of growth at the point x₀ is defined as the limit.


Definition Let f : ]a, b[ → R and a < x₀ < b. If the limit

\[ \lim_{x\to x_0} \frac{f(x) - f(x_0)}{x - x_0} = A \]

exists and is a finite number we say that f is differentiable at x₀ and call A the
derivative of f at x₀.


We emphasise that A is a finite number; if the limit is ∞ or −∞ we may sometimes
say that the derivative is ∞ or −∞ respectively, but we will never say that f is
differentiable at x₀.

Another version of the definition of derivative, that arises by replacing x − x₀ by
h, is

\[ \lim_{h\to 0} \frac{f(x_0 + h) - f(x_0)}{h}, \]

provided that the limit exists and is a finite number. The quotient appearing here is
called a difference quotient. It is defined for both positive and negative values of h,
though not for h = 0, but |h| should not be so big that x₀ + h falls outside the domain
of f.
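
To get a feeling for the definition one can tabulate difference quotients for shrinking values of h. The following sketch is not part of the original text; the function f(x) = x², the point x₀ = 3 and the particular increments are illustrative choices. For this f the difference quotient equals 2x₀ + h exactly, so the values should approach 6 from above for positive h and from below for negative h, up to rounding error.

```python
def difference_quotient(f, x0, h):
    """Return (f(x0 + h) - f(x0)) / h for a nonzero increment h."""
    if h == 0:
        raise ValueError("h must be nonzero")
    return (f(x0 + h) - f(x0)) / h

def square(x):
    return x * x

x0 = 3.0
increments = [10.0 ** (-k) for k in range(1, 7)]   # 0.1, 0.01, ..., 0.000001

# Quotients for positive and for negative increments.
right = [difference_quotient(square, x0, h) for h in increments]
left = [difference_quotient(square, x0, -h) for h in increments]
```

In floating-point arithmetic h cannot be taken arbitrarily small, since the subtraction in the numerator loses accuracy; that is a limitation of the computation and not of the definition.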

If f is differentiable at x₀ we denote its derivative at x₀ by f′(x₀). We say that the
function f is differentiable in the interval ]a, b[ if f is differentiable at every point
of the interval.

The definition of derivative follows a pattern that we have set in defining limit
and the sum of an infinite series, and will continue in defining integral. The quantity
in question that we wish to define does not necessarily exist. The definition of the
quantity states when it exists, and given that it exists defines its value. Just as it
is illogical to write lim_{x→c} f(x) without first ascertaining whether the limit exists
(though we often do this), we should not write f′(c) without first ascertaining whether
f is differentiable at c.

If f is differentiable in the interval ]a, b[ we get a new function


f′ : ]a, b[ → R,  f′(x) = derivative of f at x.


The operation of creating f’ from f is called differentiation of the function f. 


5.1.1 Differentiability and Continuity 


Proposition 5.1 Let the function f be differentiable at the point x₀. Then f is
continuous at x₀.


Proof We have, for x ≠ x₀,

\[ f(x) - f(x_0) = \left( \frac{f(x) - f(x_0)}{x - x_0} \right) (x - x_0), \]

so that, by the rule for the limit of a product,

\[ \lim_{x\to x_0} \bigl( f(x) - f(x_0) \bigr) = \lim_{x\to x_0} \left( \frac{f(x) - f(x_0)}{x - x_0} \right) \cdot \lim_{x\to x_0} (x - x_0) = f'(x_0) \cdot 0 = 0. \]

In other words

\[ \lim_{x\to x_0} f(x) = f(x_0), \]

which says that f is continuous at x₀.


Continuity is therefore necessary for differentiability, but it is far from being suffi- 
cient. 


5.1.2. Derivatives of Some Basic Functions 


Now we can begin to differentiate functions from first principles, that is, by applying 
the definition of derivative as the limit of the difference quotient. 


(a) Let f be the constant function, f(x) = C for all real numbers x. Then

\[ \frac{f(x+h) - f(x)}{h} = \frac{C - C}{h} = 0 \]

and so f′(x) = 0.

(b) Next let f be the so-called identity function, defined by f(x) = x for all real
numbers x. Then

\[ \frac{f(x+h) - f(x)}{h} = \frac{x + h - x}{h} = 1 \]

and so f′(x) = 1.

(c) Next let f be the function f(x) = x². Then

\[ \frac{f(x+h) - f(x)}{h} = \frac{(x+h)^2 - x^2}{h} = \frac{2xh + h^2}{h} = 2x + h \]

and so, by the rule for the limit of a sum,

\[ f'(x) = \lim_{h\to 0} (2x + h) = 2x. \]


We could go on, but it is far better to use the differentiation rules, as set out in
the next paragraphs. These allow one to differentiate without considering difference
quotients and limits. They make differentiation an almost mechanical procedure, and
are of immense practical and historical importance. Without them there would be no
calculus justifying the name.


5.1.3 Exercises 


1. Differentiate from first principles, that is, using the definition of derivative as limit 
of the difference quotient: 


(a) ax +b, where a and b are constants. 

Oh ae 

(c) xⁿ (n a natural number)

(d) √x

(ce) x. 
Hint. Use algebraic properties of these functions. The only analytic input 
needed is their continuity. 


2. Differentiate the circular functions sin x and cos x from first principles (that is,
by calculating the limit of the difference quotient). You will need algebraic input
in the form of the addition formulas

\[ \sin(u + v) = \sin u \cos v + \cos u \sin v, \]

\[ \cos(u + v) = \cos u \cos v - \sin u \sin v, \]

and two facts of analysis to be proved later: the continuity of both functions, and
the limit

\[ \lim_{x\to 0} \frac{\sin x}{x} = 1. \]

Note. The circular functions will be defined analytically in a later chapter. The reader has
doubtlessly been introduced to them through school mathematics, in which it is usual to obtain
the addition formulas by geometry and the limit of sin x/x by geometric intuition.

3. Differentiate the exponential function eˣ from first principles. You will need the
algebraic input that eˣ satisfies the first law of exponents:

\[ e^{x+y} = e^x e^y, \]

and the analytic input that

\[ \lim_{x\to 0} \frac{e^x - 1}{x} = 1, \]

equivalent to giving the derivative of eˣ at x = 0; this essentially pins down the
special base e.

Note. Just like the circular functions the exponential function and its inverse the natural logarithm
will be defined analytically in a later chapter. We do not have to take for granted the existence
of a function with these properties.

4. An exponential function aˣ can be defined for any positive base a. It satisfies the
law of exponents a^(x+y) = aˣ aʸ. For the sake of this exercise we shall adopt the
notation E_a(x) = aˣ. Assuming that E_a is differentiable derive the formula

\[ E_a'(x) = k_a E_a(x) \]

where k_a = E_a′(0).

Note. The special base e could be defined as the number that satisfies k_e = 1, though there
would be difficulties involved; for example, why does such a number exist and why is it
unique? Compare the previous exercise.

5. The natural logarithm ln x is the inverse function to the exponential function
(Exercise 3) and from it ln x inherits the law of logarithms: ln(xy) = ln x + ln y.
Differentiate ln x from first principles.

Hint. You may need to figure out first why lim_{h→0} ln(1 + h)/h = 1.

6. Let f(x) = |x|. Show that f is differentiable at all points except x = 0. Show
that f′(x) = x/|x| if x ≠ 0.

7. Let a₁, a₂, …, a_m be a strictly increasing sequence of real numbers. Let f(x) =
Σ_{j=1}^{m} |x − a_j| for each real x.


(a) Show that f is continuous at every point x, whereas it is differentiable
everywhere except at the points a_j, (j = 1, …, m).

(b) Show that the derivative is constant in each of the open intervals ]a_k, a_{k+1}[,
as well as in ]−∞, a₁[ and in ]a_m, ∞[, and find a formula for it.

(c) Sketch the graphs in the cases

\[ y = |x+1| + |x| + |x-1| \]

and

\[ y = |x+2| + |x+1| + |x-1| + |x-2|. \]

8. Let f be the function with domain R defined by letting f(x) = x if x is rational 
and f(x) = 0 if x is irrational. 


(a) Are there any points at which f is differentiable? 
(b) Are there any points at which the function g(x) := xf (x) is differentiable? 


9. Let f : ]0, 1[ → R be the function defined in Sect. 4.2, Exercise 18. Recall that
f(x) = 0 if x is irrational and f(x) = 1/b if x is the fraction a/b expressed in
lowest terms. Show that f is nowhere differentiable.

Hint. Show that if x is irrational then there exist arbitrarily small h such that

\[ \left| \frac{f(x+h) - f(x)}{h} \right| > 1. \]


5.2 Differentiation Rules


The elementary differentiation rules put the calculus into analysis. There are two
groups of rules. The first deals with functions constructed by algebraic operations,
addition, multiplication and division, from other functions. The second comprises
the rule for differentiating composite functions (the chain rule) and the rule for
differentiating inverse functions.

Let f : ]a, b[ → R, g : ]a, b[ → R. Sum, product and quotient of functions are
defined pointwise:

\[ (f+g)(x) = f(x) + g(x), \qquad (fg)(x) = f(x)g(x), \qquad \left(\frac{f}{g}\right)(x) = \frac{f(x)}{g(x)}. \]

Take care not to confuse product fg and composition f ∘ g.

We state the first group of differentiation rules in the following lengthy proposition.
In the proofs we rely entirely on the limit rules of Sect. 4.2; we never have to say
“Let ε > 0”.


Proposition 5.2 Let f : ]a, b[ → R, g : ]a, b[ → R. Let c be in the interval ]a, b[
and assume that both f and g are differentiable at the point c. Let α be a numerical
constant. Then αf, f + g and fg are differentiable at c and we have


(1) (αf)′(c) = αf′(c) (Multiplication by a constant)

(2) (f + g)′(c) = f′(c) + g′(c) (Sum of functions)

(3) (fg)′(c) = f′(c)g(c) + f(c)g′(c) (Product of functions; Leibniz’s rule).

If moreover g(c) ≠ 0, then 1/g and f/g are differentiable at c, and we have the
further rules:

(4) (f/g)′(c) = (g(c)f′(c) − g′(c)f(c)) / (g(c))² (Quotient rule)

(5) (1/g)′(c) = −g′(c) / (g(c))² (Reciprocal rule).


Proof The rule for the derivative of αf is a special case of the rule for fg and that
for 1/g a special case of that for f/g (left to the reader to see why).

Now for the proofs of rules 2, 3 and 4. Firstly the sum. We examine the difference
quotient:

\[ \frac{(f+g)(c+h) - (f+g)(c)}{h} = \frac{f(c+h) - f(c) + g(c+h) - g(c)}{h} = \frac{f(c+h) - f(c)}{h} + \frac{g(c+h) - g(c)}{h} \]

and taking the limit we obtain the rule (f + g)′(c) = f′(c) + g′(c).

Secondly the product. We transform the difference quotient:

\[ \begin{aligned}
\frac{(fg)(c+h) - (fg)(c)}{h} &= \frac{f(c+h)g(c+h) - f(c)g(c)}{h} \\
&= \frac{f(c+h)g(c+h) - f(c)g(c+h) + f(c)g(c+h) - f(c)g(c)}{h} \\
&= \frac{f(c+h) - f(c)}{h}\, g(c+h) + f(c)\, \frac{g(c+h) - g(c)}{h}.
\end{aligned} \]

We take the limit as h → 0, using the rules for the limits of sums and products, and
remembering that lim_{h→0} g(c + h) = g(c) since g, being differentiable, is continuous
at c. Thus we obtain the rule for the product, (fg)′(c) = f′(c)g(c) + f(c)g′(c).

Next the quotient. Again we transform the difference quotient by algebra:

\[ \begin{aligned}
\frac{(f/g)(c+h) - (f/g)(c)}{h} &= \frac{1}{h} \left( \frac{f(c+h)}{g(c+h)} - \frac{f(c)}{g(c)} \right) \\
&= \frac{1}{h} \cdot \frac{f(c+h)g(c) - f(c)g(c+h)}{g(c+h)g(c)} \\
&= \frac{ g(c)\, \dfrac{f(c+h) - f(c)}{h} - f(c)\, \dfrac{g(c+h) - g(c)}{h} }{ g(c+h)g(c) }.
\end{aligned} \]

We let h → 0, use the rules for limits of sums, products and quotients, remember
that lim_{h→0} g(c + h) = g(c) and obtain the limit

\[ \frac{g(c)f'(c) - g'(c)f(c)}{(g(c))^2}. \]
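
The product and quotient rules can be checked numerically on a simple pair of functions. The sketch below is not part of the original text. It takes f(x) = x² and g(x) = x + 2, whose derivatives 2x and 1 follow from the examples of Sect. 5.1.2 and the sum rule, and compares the right-hand sides of rules (3) and (4) with a symmetric difference quotient. The symmetric quotient is used here only as a convenient numerical stand-in for the derivative and is not the difference quotient of the definition; the point c = 1.5 and the step h are illustrative choices.

```python
def central_difference(func, c, h=1e-6):
    """Symmetric difference quotient, a numerical stand-in for func'(c)."""
    return (func(c + h) - func(c - h)) / (2.0 * h)

def f(x):
    return x * x          # f'(x) = 2x

def df(x):
    return 2.0 * x

def g(x):
    return x + 2.0        # g'(x) = 1

def dg(x):
    return 1.0

c = 1.5

# Right-hand sides of rules (3) and (4) of Proposition 5.2.
product_rule = df(c) * g(c) + f(c) * dg(c)
quotient_rule = (g(c) * df(c) - dg(c) * f(c)) / g(c) ** 2

# Numerical derivatives of fg and f/g at c, for comparison.
product_numeric = central_difference(lambda x: f(x) * g(x), c)
quotient_numeric = central_difference(lambda x: f(x) / g(x), c)
```

The two pairs of numbers agree to many decimal places, which is a sanity check on the formulas and of course no substitute for the proof.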


5.2.1 Differentiation of the Power Function 


If n is a positive integer and f is the function $x^n$ then we have 

\[ f'(x) = n x^{n-1}. \]

This is now easy to prove without considering the limit of a difference quotient. We 
use induction. The rule is known for $n = 1$. Let us assume it holds for a particular 
integer n and write $x^{n+1} = x \cdot x^n$. Using the rule for differentiating a product we 
obtain for the derivative of $x^{n+1}$ the formula 

\[ 1 \cdot x^n + x \cdot (n x^{n-1}) = (n + 1)x^n. \]

This proves the rule generally for positive integers n. 
Next we consider $f(x) = x^{-n} = 1/x^n$. The rule for the derivative of a quotient 
gives the formula 

\[ f'(x) = -\frac{n x^{n-1}}{x^{2n}} = -n x^{-n-1}. \]

So we have now shown that the derivative of $x^a$ is $a x^{a-1}$ in all cases when a is an 
integer (positive or negative). 

What about the power $x^{1/n}$, which denotes the $n$th root $\sqrt[n]{x}$, or the fractional 
power $x^{m/n} = \sqrt[n]{x^m}$? For these we need the rule for differentiating inverse 
functions, and the celebrated chain rule, often referred to somewhat misleadingly as the 
rule for functions of a function. The latter rule, which we take first, is used to 
differentiate composite functions and is perhaps the most remarkable of the 
differentiation rules. 


5.2.2 The Chain Rule 


Proposition 5.3 Let $f : A \to \mathbb{R}$, $g : B \to \mathbb{R}$, where A and B are open intervals 
and $f(A) \subset B$. Form the composition $g \circ f : A \to \mathbb{R}$, 

\[ (g \circ f)(x) = g(f(x)), \quad (x \in A). \]

Let $x_0 \in A$, assume that f is differentiable at $x_0$ and g is differentiable at $f(x_0)$. 
Then $g \circ f$ is differentiable at $x_0$ and 

\[ (g \circ f)'(x_0) = g'(f(x_0))\, f'(x_0). \]


The chain rule is illustrated in Fig. 5.1. 

[Fig. 5.1 A view of the chain rule. Compose functions: $z = g(y) = (g \circ f)(x)$. 
Multiply derivatives: $(g \circ f)'(x) = g'(y) f'(x)$.] 


Proof of the Chain Rule Let $y_0 = f(x_0)$. As for the previous rules we start by 
applying some algebra to the difference quotient: 

\[
\frac{(g \circ f)(x_0 + h) - (g \circ f)(x_0)}{h}
  = \frac{g(f(x_0 + h)) - g(f(x_0))}{f(x_0 + h) - f(x_0)} \cdot \frac{f(x_0 + h) - f(x_0)}{h}.
\tag{5.1}
\]


The second factor on the right-hand side has the limit $f'(x_0)$. As for the first factor 
it looks as if it should have the correct limit $g'(y_0)$. For we can think of $f(x_0 + h)$ 
as $y_0 + k$ (effectively defining the new quantity k) and then the first factor is the 
quotient 

\[ \frac{g(y_0 + k) - g(y_0)}{k}. \]

As $h \to 0$ we have $k \to 0$ also and we seem to have a proof. 

But there is a problem here. Although h is not 0 (as befits a correctly formed 
difference quotient) the denominator k, defined to be the difference 
$f(x_0 + h) - f(x_0)$, can be 0, and the first factor is then not defined for such values 
of h. There could even exist such values of h that are arbitrarily small, which are then 
impossible to escape. 

To save the proof we shall define a function R, the domain of which is a suitably 
small interval $]-\alpha, \alpha[$, in such a way that formula (5.1) for the difference quotient 
is correct if $R(f(x_0 + h) - f(x_0))$ replaces the first factor. 

For $\alpha > 0$ and suitably small (the reader should try to figure out what “suitably 
small” means in this context and why we have to say it) we set 

\[
R(t) =
\begin{cases}
\dfrac{g(y_0 + t) - g(y_0)}{t} & \text{if } 0 < |t| < \alpha \\[2ex]
g'(y_0) & \text{if } t = 0.
\end{cases}
\]

Note that R is continuous at the point $t = 0$ because 

\[ \lim_{t \to 0} \frac{g(y_0 + t) - g(y_0)}{t} = g'(y_0). \]

Moreover 

\[ g(y_0 + t) - g(y_0) = R(t)\, t \]

both when $t \neq 0$ and when $t = 0$. In this equation we replace t by the difference 
$f(x_0 + h) - f(x_0)$. This is allowed if $|h|$ is sufficiently small and then we have 

\[ g(f(x_0 + h)) - g(f(x_0)) = R\bigl(f(x_0 + h) - f(x_0)\bigr)\,\bigl(f(x_0 + h) - f(x_0)\bigr). \]


Division by h when the latter is not 0, but still sufficiently small, gives 

\[
\frac{g(f(x_0 + h)) - g(f(x_0))}{h}
  = R\bigl(f(x_0 + h) - f(x_0)\bigr)\, \frac{f(x_0 + h) - f(x_0)}{h}.
\]

Now we may let h tend to 0 and by the limit rules the right-hand side has the limit 

\[
\Bigl(\lim_{h \to 0} R\bigl(f(x_0 + h) - f(x_0)\bigr)\Bigr) f'(x_0) = R(0)\, f'(x_0) = g'(f(x_0))\, f'(x_0).
\]


In slightly more detail (we seriously want this proof to be correct) we can introduce 
the function $\phi(h) := f(x_0 + h) - f(x_0)$. Then $\phi$ is continuous at $h = 0$, and 
$\phi(0) = 0$. Moreover the function R is continuous at 0 as we saw. Hence the 
composition $R \circ \phi$ is continuous at 0 and so 
$\lim_{h \to 0} R(\phi(h)) = R(\phi(0)) = R(0)$, as we wrote above. 
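
The role of the function R can be seen in a small Python sketch. We assume, purely for illustration, the function f(x) = x^2 for x > 0 and f(x) = 0 for x ≤ 0 (differentiable at 0 with f'(0) = 0), together with g(y) = y^3 + y and the point x_0 = 0; none of these choices comes from the text. For every negative h the quantity k = f(x_0 + h) − f(x_0) is exactly 0, so the first factor of (5.1) cannot be formed, whereas the version using R causes no difficulty.

```python
import math

def f(x):
    # Illustrative choice: f vanishes for x <= 0, so k = 0 for every negative h.
    return x * x if x > 0 else 0.0

def g(y):
    return y ** 3 + y

def g_prime(y):
    return 3 * y ** 2 + 1

x0 = 0.0
y0 = f(x0)

def naive_first_factor(h):
    """First factor of (5.1); fails with ZeroDivisionError when k == 0."""
    k = f(x0 + h) - f(x0)
    return (g(y0 + k) - g(y0)) / k

def R(t):
    """The function R of the proof: a difference quotient of g, patched at t = 0."""
    if t == 0:
        return g_prime(y0)
    return (g(y0 + t) - g(y0)) / t

def quotient_via_R(h):
    """Difference quotient of g o f at x0, written as R(k) * (k / h)."""
    k = f(x0 + h) - f(x0)
    return R(k) * (k / h)

def direct_quotient(h):
    """The same difference quotient computed directly."""
    return (g(f(x0 + h)) - g(f(x0))) / h

for h in (1e-1, 1e-3, -1e-2):
    print(h, quotient_via_R(h), direct_quotient(h))
```

Both columns agree for positive and negative h, and they tend to g'(f(0)) f'(0) = 0 as h tends to 0, in accordance with the chain rule.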


5.2.3 Differentiation of Inverse Functions 


This is the last of the elementary differentiation rules. The lengthy preamble repeats 
the conditions (see Proposition 4.12) under which the inverse function exists and 
should not distract the reader from the extraordinary simplicity of the formula that 
is the conclusion. 


Proposition 5.4 


Preamble. Let $f : ]a, b[ \to \mathbb{R}$ be continuous and strictly increasing (the point a may 
be $-\infty$ and b may be $\infty$). Let $c = \lim_{x \to a+} f(x)$ and $d = \lim_{x \to b-} f(x)$ (the limits 
exist if we allow $c = -\infty$ and $d = \infty$). The inverse function $g : ]c, d[ \to \mathbb{R}$ therefore 
exists, is continuous, and maps the interval $]c, d[$ on to the interval $]a, b[$. 
Conclusion. Let $a < x_0 < b$ and assume that f is differentiable at $x_0$, and that 
$f'(x_0) \neq 0$. Then g is differentiable at $f(x_0)$ and 

\[ g'(f(x_0)) = \frac{1}{f'(x_0)}. \]

A similar conclusion holds if f is strictly decreasing; the only difference is that $d < c$ 
and g has the domain $]d, c[$. 


Proof Let $y_0 = f(x_0)$. We have to show that $g'(y_0) = f'(x_0)^{-1}$. Connect the 
variables h and k by the equation 

\[ y_0 + k = f(x_0 + h), \quad \text{equivalently} \quad h = g(y_0 + k) - g(y_0) \]

(recall that $g(y_0) = x_0$). The second equation here shows h as a function of k; it is 
a continuous, injective function of k, defined when k is sufficiently small. Moreover 
$h = 0$ when $k = 0$. 

We also have 

\[ \frac{g(y_0 + k) - g(y_0)}{k} = \frac{h}{k} = \frac{h}{f(x_0 + h) - f(x_0)}. \]

Let $k \to 0$ and think of h as a function of k, as defined above. Then h tends to 0, but 
is not 0 as long as $k \neq 0$. By the rule for the limit of a reciprocal, the right-hand side 
has the limit $f'(x_0)^{-1}$. 


Assuming that $f'(x) \neq 0$ for all x in the interval $]a, b[$, we have the conclusion that 

\[ (f^{-1})'(y) = \frac{1}{f'(f^{-1}(y))} \tag{5.2} \]

for all y in the interval $]c, d[$ (or in $]d, c[$ if f is decreasing). 
We shall see later (Sect. 5.6) that if $f'(x) > 0$ for all x in the open interval A, then 
f is strictly increasing in A, so that Proposition 5.4 is immediately applicable. 
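
Formula (5.2) can be tried out numerically even when the inverse function has no convenient closed form. The sketch below assumes, for illustration only, the function f(x) = x + e^x (strictly increasing, with f(0) = 1) and computes the inverse by bisection; the function names and the bisection bounds are our own assumptions.

```python
import math

def f(x):
    return x + math.exp(x)       # strictly increasing on the whole real line

def f_prime(x):
    return 1.0 + math.exp(x)     # always positive

def f_inverse(y, lo=-50.0, hi=50.0, iterations=200):
    """Solve f(x) = y by bisection; relies on f being continuous and increasing."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def inverse_derivative(y):
    """Formula (5.2): the derivative of the inverse is 1 / f'(f^{-1}(y))."""
    return 1.0 / f_prime(f_inverse(y))

y = 1.0                          # f(0) = 1, so (5.2) predicts 1 / f'(0) = 1/2
k = 1e-6
central = (f_inverse(y + k) - f_inverse(y - k)) / (2 * k)
print(inverse_derivative(y), central)
```

The second printed number is a symmetric difference quotient of the inverse function itself; it should agree with the value given by (5.2) to several decimal places.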


5.2.4 Differentiation of Fractional Powers 


Let $f(x) = x^{1/n}$, where n is a positive natural number. We have here the inverse 
function of the function $g(x) = x^n$. The domain is the interval $]0, \infty[$. By the rule 
for differentiating an inverse function (that is, we apply (5.2) to the function g with 
x instead of y) we have 

\[
f'(x) = (g^{-1})'(x) = \frac{1}{g'(g^{-1}(x))} = \frac{1}{n\,(x^{1/n})^{n-1}} = \frac{1}{n}\, x^{\frac{1}{n} - 1}.
\]

Next we consider the function $f(x) = x^{m/n}$, where m is an integer, positive or 
negative. This is the composition $(x^m)^{1/n}$. By the chain rule we have 

\[
f'(x) = \frac{1}{n}\,(x^m)^{\frac{1}{n} - 1} \cdot m x^{m-1} = \frac{m}{n}\, x^{\frac{m}{n} - 1}.
\]

The conclusion is striking. The derivative of the power function $x^a$ is $a x^{a-1}$ for every 
rational power a. 

It is a further task to define the power function $x^a$ for irrational powers and prove 
that the same differentiation formula continues to be valid. 
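
For rational powers the rule is easy to test numerically. The following Python sketch, in which the helper names, the point x = 2 and the sample exponents are our own assumptions, computes $x^{m/n}$ as $(x^m)^{1/n}$ exactly as in the derivation above and compares the rule $a x^{a-1}$ with a symmetric difference quotient.

```python
from fractions import Fraction

def power(x, a):
    """x**a for x > 0 and rational a = m/n, computed as (x**m)**(1/n)."""
    return (x ** a.numerator) ** (1.0 / a.denominator)

def power_rule(x, a):
    """The derivative a * x**(a - 1) predicted by the rule."""
    return float(a) * power(x, a - 1)

def central_difference(x, a, h=1e-6):
    """Symmetric difference quotient of x**a."""
    return (power(x + h, a) - power(x - h, a)) / (2 * h)

x = 2.0
for a in (Fraction(1, 2), Fraction(2, 3), Fraction(-5, 4), Fraction(7, 3)):
    print(a, power_rule(x, a), central_difference(x, a))
```

The two columns should agree to many decimal places for each exponent, positive or negative.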


5.2.5 Exercises 


1. Differentiate the following functions. You may assume that the domain of each 
function is the set of all x for which the formula makes sense. 


1 
@ aa 72 

(b) $\dfrac{x^2 - x + 1}{x^2 + x - 1}$ 
x2—-x+1 
©) x2+x-—1 
@) 4 x*—x+1 
x2+x-—1 


(g) $\sqrt{1 + \sqrt{1 + \sqrt{x}}}$. 


2. Define the function f on the whole real line by 
x, (x < 0) 
_ jl-2V1l—-—x, O<x <1) 
ON a (<x <2) 


Vx2+5-1, (2<x). 


Determine where f is differentiable, and where it is, find its derivative. 

Hint. In this, and in similar examples where a function is defined by cases, the 
differentiation rules are only useful in the open intervals between the partition 
points. At the partition points something else is required, such as arguing by 
examination of the difference quotient. 


3. For this exercise we assume some knowledge of the circular functions sin x and 
cos x, including their derivatives (see Sect. 5.1, Exercise 2). Determine where the 
following functions are differentiable and calculate the derivative when it exists: 

(a) $f(x) = \begin{cases} \sin\frac{1}{x}, & (x > 0) \\ 0, & (x \le 0). \end{cases}$ 

(b) $f(x) = \begin{cases} x \sin\frac{1}{x}, & (x > 0) \\ 0, & (x \le 0). \end{cases}$ 

(c) $f(x) = \begin{cases} x^2 \sin\frac{1}{x}, & (x > 0) \\ 0, & (x \le 0). \end{cases}$ 


4. Show that the function $f(x) = x^5 + x$ is strictly increasing on the whole real line 
and calculate $(f^{-1})'(2)$. 


5. Let $f_k$ $(k = 1, 2, \dots, n)$ be differentiable functions. Let g be their product, 
$f_1 f_2 \cdots f_n$. Show that 

\[ \frac{g'(x)}{g(x)} = \sum_{k=1}^{n} \frac{f_k'(x)}{f_k(x)} \]

at every point x at which none of the denominators is 0. 


5.3 Leibniz’s Notation 


There are several notational systems in use for derivatives. They reflect the differing 
views of Newton and of Leibniz. Newton used dots to signify the derivative, as in 
$\dot{x}$ and $\dot{y}$. One might say that the various dashes, as in $f'$ and $f''$, popularised by 
Lagrange, reflect Newton’s notation. Leibniz introduced the expressions dx and dy, 
signifying in his view infinitesimal changes in the variables x and y (“d” for Latin 
“differentia”), and leading to the differential quotient dy/dx. He also introduced 
the integral sign “∫”, an elongated “S” (for Latin “Summa”). Each notation has its 
advantages and it is best to learn how to use both. 


5.3.1 Tangent Lines 


We often think of a function f as a curve in the (x, y)-plane. The curve in question 
is the set of all points (x, y) that satisfy y = f(x), in other words the graph of f. 
Leibniz’s notation reflects the geometric intuition behind the idea of a tangent line 
to a curve. 

A line is a curve of the form y = mx +c with constants m (the slope) and c (the 
intercept). The curve x = a is also a line but it is not a graph of a function in the 
above sense. It is though a graph if we think of x as a function of y (in this case a 
constant function). 

We could ask whether every curve in the plane can be described (perhaps locally; 
in small sections at a time) as a graph, in which y is a function of x, or x is a function 
of y. The question arises even for familiar everyday curves like the circle and shows 
the limitation of thinking of a curve simply as a graph. This gets us into the area of 
differential geometry. We would have to give a general definition of curve, a task 
that is not so straightforward. 

The ancient Greek geometers tried to define a tangent line to a curve as a line that 
meets the curve in only one point. This works for circles (and more generally conic 
sections) but not for more complicated curves. Differential calculus allows us to give 
a correct definition of tangent line to a curve when that curve is a graph y = f(x), 
and its extension to differential geometry does the job for more general curves. For 
this reason it is said that differentiation solved the problem of tangents. 

Consider a differentiable function f. The tangent line to the curve y = f(x), at a 
point $(x_0, y_0)$ on the graph (that this point lies on the graph means that $y_0 = f(x_0)$), 
is the line through the point $(x_0, y_0)$ that has the slope $f'(x_0)$. In other words it is the 
line 

\[ y - y_0 = f'(x_0)(x - x_0) \]

or equivalently 

\[ y = f'(x_0)\,x + (y_0 - f'(x_0)\,x_0). \]

[Fig. 5.2 Leibniz’s differential quotient. The secant has slope $\Delta y/\Delta x$ and the tangent 
has slope $dy/dx$; the points $x$ and $x + \Delta x$ are marked on the horizontal axis.] 


The intuitive thinking behind this is that the tangent line at $(x_0, y_0)$ is the limit of 
a secant line through the two points, $(x_0, y_0)$ and $(x_0 + \Delta x, y_0 + \Delta y)$, both on the 
graph y = f(x), the limit being taken as $\Delta x \to 0$. The slope of the secant line is 
$\Delta y/\Delta x$ and we want to make $\Delta x$, and as a result $\Delta y$, tend to 0. 

In the view of seventeenth century mathematicians, who did not possess a definition 
of limit, the quantity $\Delta x$ was actually supposed to become infinitely small, 
the tangent being thought to intersect the curve at two distinct points infinitely close 
together. For the slope of the tangent we obtain a quotient of infinitely small 
quantities, or infinitesimals. This intuition lies behind Leibniz’s notation for derivatives 
(Fig. 5.2). 
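
The passage from secant to tangent can be watched numerically. In the Python sketch below the curve y = x^3, the point x_0 = 1 and the helper names are illustrative assumptions of ours; the secant slopes Δy/Δx are printed for decreasing Δx, followed by the slope and intercept of the tangent line in the form y = mx + c.

```python
def f(x):
    return x ** 3                 # illustrative curve y = x^3

def f_prime(x):
    return 3 * x ** 2

def secant_slope(x0, dx):
    """Slope of the secant through (x0, f(x0)) and (x0 + dx, f(x0 + dx))."""
    dy = f(x0 + dx) - f(x0)
    return dy / dx

def tangent_line(x0):
    """Return (m, c) such that the tangent at x0 is the line y = m*x + c."""
    m = f_prime(x0)
    return m, f(x0) - m * x0

x0 = 1.0
for dx in (1.0, 0.1, 0.01, 0.001):
    print(dx, secant_slope(x0, dx))
print(tangent_line(x0))
```

For this curve the secant slope is exactly 3 + 3Δx + (Δx)^2, so the printed slopes approach the tangent slope 3 as Δx decreases.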


5.3.2 Differential Quotients 


Leibniz proposed setting an infinitesimal dx in place of $\Delta x$, as the notion of limit 
was not available to him. He would have said that y underwent a corresponding 
change, which was also an infinitesimal dy, and the derivative was the quotient 
dy/dx. Although dx and dy are infinitesimals (whatever that means) the quotient 
is an ordinary real number. He called the infinitesimals dx and dy differentials. The 
derivative was then the differential quotient. 

According to the prevailing modern view the derivative is not a quotient; it is 
though the limit of a quotient, namely the limit of the difference quotient. In spite of 
this it is possible to define differentials, expressed in the classical notation dx and 
dy, without resorting to the mysterious infinitesimals. This is very useful for calculus 
in several variables and differential geometry of surfaces and their generalisations, 
manifolds. It means, for example, that classical formulas, such as $dy = f'(x)\,dx$, 
remain valid with an appropriate interpretation of their symbols. However, that is a 
whole new topic.¹ 

Here are some examples of statements written using Leibniz’s notation. It will 
be seen that they have certain advantages, notably brevity and flexibility, over their 
equivalents using function symbols: 

(a) if $y = x^3$ then $\dfrac{dy}{dx} = 3x^2$. 
That is, if f is the function $f(x) = x^3$ then $f'(x) = 3x^2$. 

(b) $\dfrac{d}{dx}\, x^3 = 3x^2$. 
Same meaning as the previous item. We again avoid using a symbol for the function, 
as well as mentioning the variable y. 

(c) $\left.\dfrac{d}{dx}\, x^3 \right|_{x=1} = 3$. 
In other words if f is the function $f(x) = x^3$ then $f'(1) = 3$. The vertical stroke 
with the subscript “x = 1” means evaluate the preceding expression at x = 1. 


5.3.3 The Chain Rule and Inverse Functions in Leibniz’s Notation 


Many calculations using the chain rule or the inverse-function rule are easier to carry 
out using Leibniz’s notation. This makes it particularly useful for effecting a change 
of variables in a differential equation, a subject not covered in the present text. 

Functions f and g are given and we wish to differentiate the composed function 
$g \circ f$. We consider that the function f sets up a relation between variables x and 
y, namely y = f(x), whilst g sets up a relation between variables y and z, namely 
z = g(y). Then the composition $g \circ f$ sets up the relation $z = (g \circ f)(x)$. 

We can differentiate the composition $g \circ f$ using the chain rule. In Leibniz’s 
notation we are finding the differential quotient dz/dx and this is given by the 
striking formula 

\[ \frac{dz}{dx} = \frac{dz}{dy}\,\frac{dy}{dx}. \]

This is of course just the formula 

\[ (g \circ f)'(x) = g'(f(x))\, f'(x). \]

The first factor on the right-hand side, that is dz/dy, must be interpreted with some 
care. We first differentiate z with respect to y, but then express this as a function of 
x, using the relationship between y and x. 

¹ This has nothing to do with what is known as non-standard analysis. In the latter the real number 
system is extended by including infinitely small quantities and infinitely large quantities. 

We illustrate these steps by differentiating $\sqrt{1 - x^2}$. We set $y = 1 - x^2$ and 
$z = \sqrt{y}$. Then 

\[
\frac{dz}{dx} = \frac{dz}{dy}\,\frac{dy}{dx} = \frac{1}{2\sqrt{y}}\,(-2x) = -\frac{x}{\sqrt{1 - x^2}}.
\]

Consider next inverse functions. If y is a function of x, namely y = f(x), we can 
turn this round and look at x as a function of y, namely $x = f^{-1}(y)$. The rule for 
differentiating $f^{-1}$ takes the memorable form 

\[ \frac{dx}{dy} = 1 \Big/ \frac{dy}{dx}. \]

This is the same formula as the less intuitive 

\[ (f^{-1})'(y) = \frac{1}{f'(f^{-1}(y))}. \]

As an example we shall differentiate the function $x^{1/n}$. Let $y = x^{1/n}$ and turn it 
around giving $x = y^n$. Then 

\[ \frac{dx}{dy} = n y^{n-1} \]

so that by the rule we find 

\[
\frac{dy}{dx} = \frac{1}{n y^{n-1}} = \frac{1}{n\, x^{(n-1)/n}} = \frac{1}{n}\, x^{\frac{1}{n} - 1}.
\]


5.3.4 Tangents to Plane Curves 


In analytic geometry, the simplest way to represent a circle with centre (a, b) and 
radius r is by means of the equation $(x - a)^2 + (y - b)^2 = r^2$. Here the curve is not 
seen as a graph; in order to do so we must solve for y as a function of x, or for x as a 
function of y. To represent a curve in analytic geometry as a graph, we usually have 
to break it into pieces. 

A simple example is that of the unit circle $x^2 + y^2 = 1$. Solving for y we obtain 
two solutions, and two graphs: 

\[ y = \sqrt{1 - x^2}, \quad (-1 \le x \le 1) \quad \text{the upper semicircle} \]

\[ y = -\sqrt{1 - x^2}, \quad (-1 \le x \le 1) \quad \text{the lower semicircle.} \]

Now we can differentiate these formulas in order to compute the tangents to the 
circle, using the appropriate formula for each semicircle. 

However, there is another way to calculate the tangent at a point $(x_0, y_0)$ on the 
curve without solving for y as a function of x. Suppose that we are looking at a part 
of the circle that can be represented as a graph y = f(x), where f is differentiable, 
and contains the point $(x_0, y_0)$. Then $y_0 = f(x_0)$ and the equation 

\[ x^2 + (f(x))^2 = 1 \]

holds for all x in some interval containing $x_0$. We may differentiate with respect to 
x, using the differentiation rules, and obtain 

\[ 2x + 2f(x)f'(x) = 0. \]

In particular $f'(x_0) = -x_0/f(x_0) = -x_0/y_0$. 
The calculation just given would normally be done without introducing a function 
symbol, using Leibniz’s notation 

\[ x^2 + y^2 = 1 \;\Longrightarrow\; 2x + 2y\,\frac{dy}{dx} = 0 \;\Longrightarrow\; \frac{dy}{dx} = -\frac{x}{y} \]

or else a form of Newton’s notation 

\[ x^2 + y^2 = 1 \;\Longrightarrow\; 2x + 2yy' = 0 \;\Longrightarrow\; y' = -\frac{x}{y}. \]


This procedure is called implicit differentiation. The differentiation proceeds with 
respect to x, but y is thought of as a function of x, the exact form of which is not 
required. We obtain dy/dx, and express it as a function of x and y, without knowing 
the function y = f (x). Logically, we only need to know that the function f (x) exists, 
and is differentiable. This can usually be guaranteed by a theorem of multivariate 
calculus, the implicit function theorem, which is beyond the scope of this text. 


Example The equation $2y^5 - xy - x^4 = 0$ defines some kind of curve in the 
coordinate plane. We observe that it contains the point (1, 1). To solve for y as a function 
of x, or for x as a function of y, is difficult (although some algebraic arguments show 
that there is a unique positive y for each positive x; see the nugget “Multiplicity”). 
Nevertheless, we can calculate the tangent to the curve at the point (1, 1). Assuming 
that we can represent the curve around the point (1, 1) as a graph y = f(x) with 
differentiable f (a fact that can be justified using the implicit function theorem), 
implicit differentiation gives 

\[ 10y^4 y' - y - xy' - 4x^3 = 0 \;\Longrightarrow\; y' = \frac{y + 4x^3}{10y^4 - x} \]

and therefore at the point (1, 1) we have $y' = \tfrac{5}{9}$. Note how differentiating the middle 
term $xy$ gives rise to $y + xy'$ because we are thinking of y as a function of x. 
The equation of the tangent line is therefore $y - 1 = \tfrac{5}{9}(x - 1)$, or more simply, 
$5x - 9y + 4 = 0$. 
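
The slope 5/9 can be checked without any appeal to the implicit function theorem by solving the equation numerically for y at values of x close to 1. The Python sketch below does this by bisection; the bracket [0.5, 2] for y, the step h and the function names are our own assumptions, chosen so that the equation has exactly one root in the bracket when x is near 1.

```python
def F(x, y):
    """Left-hand side of the curve equation 2y^5 - xy - x^4 = 0."""
    return 2 * y ** 5 - x * y - x ** 4

def y_on_curve(x, lo=0.5, hi=2.0, iterations=200):
    """Solve F(x, y) = 0 for y by bisection (F < 0 at lo and F > 0 at hi for x near 1)."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if F(x, mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def implicit_slope(x, y):
    """The slope y' = (y + 4x^3) / (10y^4 - x) found by implicit differentiation."""
    return (y + 4 * x ** 3) / (10 * y ** 4 - x)

h = 1e-5
numeric_slope = (y_on_curve(1.0 + h) - y_on_curve(1.0 - h)) / (2 * h)
print(implicit_slope(1.0, 1.0), numeric_slope)
```

Both printed values should be close to 5/9 = 0.5555..., the slope of the tangent line 5x − 9y + 4 = 0.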


5.3.5 Exercises 


1. (a) Determine the equation of the tangent to the parabola $y = x^2$ at the point 
$(t, t^2)$. 
(b) Show that the line perpendicular to the tangent of item (a) and intersecting it 
on the x-axis, passes through the point $(0, \tfrac{1}{4})$ independently of t. 
2. Show that the equation of the tangent to the ellipse 

\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} = 1 \]

at the point $(x_0, y_0)$, assumed to be on the ellipse, is 

\[ \frac{x x_0}{a^2} + \frac{y y_0}{b^2} = 1. \]

3. Give an example of a graph y = f(x) (with differentiable f) and a point (a, f(a)) 
on the graph, such that the tangent at (a, f(a)) crosses the graph at (a, f(a)). 
4. A vessel has the shape of a right circular cone standing on its apex. Let h be the 
height of the cone and let r be the radius of its base. Mercury is poured into the 
vessel, not necessarily at a constant rate. Introduce variables: t for the time, v for 
the volume of mercury in the vessel and y for the height of the mercury in the 
vessel. 

(a) Find the relationship between $\dfrac{dv}{dt}$ and $\dfrac{dy}{dt}$. 

(b) Suppose that h = 1 m, r = 1 m, y = 0.5 m and the mercury is poured at a 
constant rate of 1 litre per second. Approximately, how much time is needed 
to raise the surface level by 1 cm? 
Note. Physics and engineering abound with problems like this one. A bunch of variables 
are connected by a constitutive relation. In this problem the relation between v and y is 
geometric. Examples from physics are pressure, volume and temperature connected by the 
ideal gas equation; or stress and strain connected by the law of elasticity. If the variables 
change with time, then the constitutive relation implies a linear connection between their 
derivatives with respect to time. If the variables are three or more then the problem really 
requires multivariate calculus, in particular partial derivatives. With two variables we can 
just about get by without them. 


5.4 Higher Order Derivatives 


If the function $f : ]a, b[ \to \mathbb{R}$ is differentiable, its derivative $f' : ]a, b[ \to \mathbb{R}$ is a new 
function. Now it could happen that $f'$ is differentiable. If so, we can differentiate and 
produce the function $f'' : ]a, b[ \to \mathbb{R}$, called the second derivative of f. Continuing 
in this way as far as is allowable, we can define a whole sequence: second, third, 
fourth, fifth, ..., $n$th derivatives of f. They are denoted by $f''$, $f'''$, $f''''$, $f'''''$ and so 
on, but around the fourth it becomes more practical to write instead $f^{(4)}$, $f^{(5)}$, ..., 
$f^{(n)}$, ... as counting those little dashes becomes tiresome and irritating. When using 
this notation it is often convenient to allow n = 0 and interpret $f^{(0)}$ to be the same 
as f. 

The differentiation can be continued beyond $f^{(n)}$ when the latter is differentiable 
on the interval $]a, b[$, where f was defined. It could happen that every function 
produced in this way is differentiable. Then we say that f is infinitely often 
differentiable. If the process can be continued at least as far as $f^{(n)}$ we say that f is n-times 
differentiable, or that f is differentiable to order n, or that f has derivatives to order 
n (none of which precludes going further). 

Can we give any sense to the statement that f is n-times differentiable at the 
point c? For a function to be differentiable at a given point it must be defined on an 
interval that contains that point. Therefore the meaning to be attached to this phrase 
is the following. There exists $\delta > 0$, such that f is (n − 1)-times differentiable in the 
interval $]c - \delta, c + \delta[$ and $f^{(n-1)}$ is differentiable at c. We sometimes say in this case 
that the derivatives $f^{(k)}(c)$ exist up to k = n; or, most briefly: f has n derivatives 
at c. 

Leibniz’s notation for the higher derivatives is 

\[
y = f(x), \quad \frac{dy}{dx} = f'(x), \quad \frac{d^2 y}{dx^2} = f''(x), \quad \dots, \quad \frac{d^n y}{dx^n} = f^{(n)}(x).
\]
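
Repeated differentiation is easy to mechanise for polynomials. The Python sketch below represents a polynomial by its list of coefficients [a_0, a_1, a_2, ...], a representation we adopt here only for illustration, and applies the power rule term by term. The convention that the 0th derivative is the function itself appears as the case n = 0.

```python
def differentiate(coeffs):
    """Derivative of a_0 + a_1 x + a_2 x^2 + ... given as [a_0, a_1, a_2, ...]."""
    return [k * a for k, a in enumerate(coeffs)][1:]

def nth_derivative(coeffs, n):
    """Apply differentiate n times; n = 0 returns the polynomial unchanged."""
    for _ in range(n):
        coeffs = differentiate(coeffs)
    return coeffs

def evaluate(coeffs, x):
    """Horner evaluation; the empty list stands for the zero polynomial."""
    result = 0
    for a in reversed(coeffs):
        result = result * x + a
    return result

p = [1, -1, 0, 2]                 # the polynomial 1 - x + 2x^3
for n in range(5):
    print(n, nth_derivative(p, n), evaluate(nth_derivative(p, n), 2))
```

For this cubic the third derivative is the constant 12 and the fourth derivative is the zero polynomial, represented by the empty list.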


5.4.1 Exercises 


1. Let f be a polynomial of degree m. Show that $f^{(k)} = 0$ for all k > m. 

2. Let g be a function having derivatives of all orders and let a be a real number, 
such that $g(a) \neq 0$. Set $f(x) = (x - a)^m g(x)$, where m is a positive integer. 
Show that $f^{(k)}(a) = 0$ for k = 0, 1, ..., m − 1, but that $f^{(m)}(a) \neq 0$. 

3. Show that 

\[ \frac{d^k}{dx^k}\, x^a = a(a - 1)\cdots(a - k + 1)\, x^{a-k}. \]

Here you may assume that a is rational (pending the rigorous definition of 
irrational powers in Chap. 7). Also you may assume that x > 0 if a is not an integer. 
If a is a positive integer show that 

\[ \frac{d^k}{dx^k}\, x^a = \frac{a!}{(a - k)!}\, x^{a-k} \]

for $k \le a$. One can even allow k > a if we interpret 1/m! as 0 if m is a negative 
integer. 

4. We assume some knowledge of the exponential function $e^x$, namely, that its 
derivative is again $e^x$. Let $f(x) = e^{-1/x}$ for $x \neq 0$. Show that for all natural 
numbers n we have 

\[ f^{(n)}(x) = P_n\!\left(\frac{1}{x}\right) e^{-1/x}, \quad (x \neq 0), \]

where for each n, $P_n(t)$ is a polynomial in the variable t of degree 2n. Find a 
recurrence formula for $P_n(t)$. 

5. Prove Leibniz’s formula for the $n$th derivative of a product. If u and v are functions 
with derivatives up to order n, then uv has derivatives to order n and 

\[ (uv)^{(n)} = \sum_{k=0}^{n} \binom{n}{k}\, u^{(k)} v^{(n-k)}. \]

6. Calculate some higher derivatives of the composite function y = g(f(x)), as far 
as your patience allows. 

7. A function is defined by 


—x>, ifx <0 
fa) =| x, if a0, 


How many derivatives does f possess at x = 0? 


8. A function with domain R is called an even function if it satisfies f(−x) = f(x) 
for all x. It is called an odd function if it satisfies f(−x) = −f(x) for all x. 

(a) Show that every function f with domain R has a unique decomposition 
f = g + h where g is even and h is odd. 

(b) Suppose that f has m derivatives at x = 0. Show that if f is even, then all 
derivatives $f^{(k)}(0)$ with odd $k \le m$ are zero. Show, on the other hand, that 
if f is odd, then all derivatives $f^{(k)}(0)$ with even $k \le m$ are zero. 

9. How many derivatives does the function |x|’/? possess at x = 0? 
10. Define a function f on the domain $]-\infty, 1[$ by 


ifx <0 


foe |” 
N= Vie ae, tS ee 


Show that f is differentiable at all points of its domain, that $f'$ is continuous, and 
that f is twice differentiable at all points except at x = 0. At x = 0 the second 
derivative does not exist; but calculate its “jump”, the quantity 

\[ \lim_{x \to 0+} f''(x) - \lim_{x \to 0-} f''(x). \]


Note. Where a straight section of rail track joins a curved section, it is safer if the curve 
is designed so that the second derivative is continuous and is 0 at the join. This is to avoid 
discontinuities in the acceleration normal to the track. The graph in the exercise typifies the 
join in a model railway, where the curves are usually arcs of circles, and that is where the 
model train is most likely to leave the track. 

11. Variables x and y are connected by the equation $2y^5 - xy - x^4 = 0$. Calculate 
the second derivative $d^2y/dx^2$ when x = 1 and y = 1. 

12. In this exercise we assume some acquaintance with determinants. Let $u_1$ and $u_2$ 
be differentiable functions in an interval A. 

(a) Suppose that the functions $u_1$ and $u_2$ are linearly dependent in A; by this is 
meant that there exist constants $\lambda_1$ and $\lambda_2$, not both 0, such that 

\[ \lambda_1 u_1(x) + \lambda_2 u_2(x) = 0 \]

for all x in A. Show that, for all x in A: 

\[
\begin{vmatrix} u_1(x) & u_2(x) \\ u_1'(x) & u_2'(x) \end{vmatrix} = 0.
\]


(b) The example wj(x) = u2(x) = 0 for x < 0 and uj (x) = x7, u2(x) = 2x? 
for x > O shows that the converse is false. 

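(c) Extend the result of item (a) to the case of m functions $u_1, \dots, u_m$, each 
m − 1 times differentiable. Show that a necessary condition for their linear 
dependence in A is that 

\[
\begin{vmatrix}
u_1(x) & u_2(x) & \cdots & u_m(x) \\
u_1'(x) & u_2'(x) & \cdots & u_m'(x) \\
\vdots & \vdots & & \vdots \\
u_1^{(m-1)}(x) & u_2^{(m-1)}(x) & \cdots & u_m^{(m-1)}(x)
\end{vmatrix} = 0
\]

for all x in A. 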


5.5 Significance of the Derivative 


In this section we begin to extract useful information about a function from knowledge 
of its derivative. 

If A is an interval we call a point c in A an interior point if c is not an endpoint 
of the interval. In the next paragraphs a set denoted by A will always be an interval 
with distinct endpoints. 


Proposition 5.5 Let $f : A \to \mathbb{R}$, let c be an interior point of A and let f be 
differentiable at c. Then the following hold: 

(1) If $f'(c) > 0$ there exists $\delta > 0$, such that 

$f(x) < f(c)$ if $c - \delta < x < c$, and $f(x) > f(c)$ if $c < x < c + \delta$. 

(2) If $f'(c) < 0$, then there exists $\delta > 0$, such that 

$f(x) > f(c)$ if $c - \delta < x < c$, and $f(x) < f(c)$ if $c < x < c + \delta$. 

Proof Let $f'(c) > 0$. Now 

\[ f'(c) = \lim_{h \to 0} \frac{f(c + h) - f(c)}{h}, \]

and taking $\varepsilon$ to be $\tfrac{1}{2} f'(c)$ in the definition of limit we find that there exists $\delta > 0$, 
such that 

\[ \frac{f(c + h) - f(c)}{h} > \frac{f'(c)}{2} \]

for all h that satisfy $0 < |h| < \delta$. For such h that are negative we have 

\[ f(c + h) - f(c) < \frac{h}{2}\, f'(c) < 0 \]

and for such h that are positive we have 

\[ f(c + h) - f(c) > \frac{h}{2}\, f'(c) > 0. \]

The case when $f'(c) < 0$ is treated similarly. 


We did not assume that f was differentiable at points other than c. But even if it 
is, the assumption that $f'(c) > 0$ tells us little about the derivative $f'(x)$, for x near 
to c. We could have points x, arbitrarily near to c, at which $f'(x) < 0$, for example, 
or even points at which $f'(x)$ is arbitrarily large. 


5.5.1 Maxima and Minima 


One of the main applications of the last paragraph is to the problem, familiar from 
applied mathematics, of finding maxima and minima. Problems of this nature are 
generally called extremal problems. 

Let $f : A \to \mathbb{R}$. Recall that A denotes an interval, with or without endpoints, 
though the latter must be distinct. The status of a point c in A regarding the local 
extremal behaviour of f can be usefully, if somewhat pedantically, classified as 
follows: 


(a) The point c is called a local minimum point for f if there exists $\delta > 0$, such that 
$f(x) \ge f(c)$ for all x in A that satisfy $|x - c| < \delta$. 

(b) The point c is called a local maximum point for f if there exists $\delta > 0$, such that 
$f(x) \le f(c)$ for all x in A that satisfy $|x - c| < \delta$. 

(c) The point c is called a strict local minimum point for f if there exists $\delta > 0$, 
such that $f(x) > f(c)$ for all x in A that satisfy $0 < |x - c| < \delta$. 

(d) The point c is called a strict local maximum point for f if there exists $\delta > 0$, 
such that $f(x) < f(c)$ for all x in A that satisfy $0 < |x - c| < \delta$. 


Note that c could be an endpoint of the interval A in these definitions. Moreover 
c could belong to none of the above four classes, in which case it is of no interest as 
regards the extremal problem for f. 

The next proposition makes precise the notion, loosely expressed, that the 
derivative vanishes at a maximum or minimum. 


Proposition 5.6 Let $f : A \to \mathbb{R}$, let c be a point in A and assume that c is either a 
local minimum point, or a local maximum point, of f. If, in addition, c is an interior 
point of A and f is differentiable at c, then $f'(c) = 0$. 


Proof Consider the case when c is a local minimum point. If $f'(c) < 0$ then, by 
Proposition 5.5, there exists $\delta > 0$, such that $f(x) < f(c)$ if $c < x < c + \delta$. If 
$f'(c) > 0$ then there exists $\delta > 0$, such that $f(x) < f(c)$ if $c - \delta < x < c$. In 
neither case can c be a local minimum point, so we have a contradiction. We conclude 
that $f'(c) = 0$. A similar argument is used for the case when c is a local maximum 
point. 


That the derivative is 0, given that c is an interior point and f is differentiable 
at c, is only a necessary condition for c to be a local minimum or maximum point. 
It is not sufficient. There is a need for a term to cover the case that f’(c) = 0, 
irrespective of whether c is a local maximum or minimum point. The terms extreme 
point, extremal point, stationary point and critical point have been used (and there are 
probably others). The last two should be preferred as they do not suggest a maximum 
or minimum. 


5.5.2 Finding Maxima and Minima in Practice 


Let $f : [a, b] \to \mathbb{R}$ be a continuous function. The domain [a, b] is a bounded and 
closed interval. We know by the extreme value theorem (Proposition 4.11) that f 
attains both a maximum and a minimum value in [a, b]. The problem of maxima and 
minima is to find the points where these are attained, as well as the maximum and 
minimum values. 

Suppose that the maximum is attained at a point x = c (there could be more than 
one such point). There are three possibilities (exclusive; each excludes the other two): 


(a) The point c is either a or b. 

(b) The point c is an interior point, that is, a point of the open interval 
$]a, b[$, f is differentiable at c and (by Proposition 5.6) $f'(c) = 0$. 

(c) The point c is an interior point at which f is not differentiable. 


The most usual situation is that there are only a finite number of points c in [a, b] 
that satisfy any one of these three conditions. It may be feasible to find them, and 
once found, to arrange them in a list. This might begin with the endpoints a and b, 
continue with the points in $]a, b[$ at which f is not differentiable (if finitely many) 
and conclude with all the solutions of $f'(x) = 0$ in $]a, b[$ (if finitely many). Now it 
only remains to calculate f at each of the points in the list and find the highest and 
lowest of these values. 
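
The listing procedure just described can be sketched in Python. The function f(x) = |x^2 − 1| on the interval [−2, 1.5] is our own illustrative choice, and the three lists of candidate points were found by hand: the endpoints, the two corners at x = ±1 where f is not differentiable, and the single solution x = 0 of f'(x) = 0 (since f(x) = 1 − x^2 for |x| < 1, whereas f'(x) = 2x ≠ 0 for |x| > 1).

```python
def f(x):
    return abs(x * x - 1.0)

a, b = -2.0, 1.5
endpoints = [a, b]
non_differentiable = [-1.0, 1.0]   # corners of |x^2 - 1| inside ]a, b[
stationary = [0.0]                 # the only solution of f'(x) = 0 in ]a, b[

# The list of candidates, in the order suggested in the text.
candidates = endpoints + non_differentiable + stationary
values = {c: f(c) for c in candidates}

maximum = max(values.values())
minimum = min(values.values())
max_points = sorted(c for c, v in values.items() if v == maximum)
min_points = sorted(c for c, v in values.items() if v == minimum)

print(maximum, max_points)
print(minimum, min_points)
```

In this example the maximum is attained at an endpoint and the minimum at the two points of non-differentiability, so a search restricted to the solutions of f'(x) = 0 would have missed both.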


5.5.3 Exercises 


1. In each of the following cases determine the maximum and minimum of the 
function f over the interval A: 


(a) f(x) = x³ − 3x² + x, A = [1, 3].
(b) f(x) = max(1 − 2x − x², 2 + x − x², 1 + 3x − x²), A = [−1, 2].


4 
a I= aie Pees 


Hint. In items (b) and (c) it helps to express the functions by cases. For the 
numerical work in these exercises it makes sense to use a calculator and state the 
answers with a certain number of decimal digits, say, three. 

2. Determine the minimum of the function 


1 
f@)=H=x+5 
re 


in the interval ]0, ∞[.

3. Let a_1, a_2, ..., a_n be a strictly increasing sequence of real numbers. Let
$f(x) = \sum_{j=1}^{n} |x - a_j|$ for each real x. Determine the minimum of f over the whole real
line.
4. Define the function f with domain R by f(x) = x² sin(1/x) for x ≠ 0, and
f(0) = 0. Show that f is everywhere differentiable, including at x = 0, but that
f’ is discontinuous at x = 0.

5. Find an example of a continuous function f that has a strict local minimum at
x = 0, and for every δ > 0 has also a strict local minimum in the interval ]0, δ[.
Hint. To help you think about it, note that there would have to be infinitely many
minima in ]0, δ[, each with a value higher than f(0).

6. Give an example of a differentiable function f, such that f’(0) > 0, but there
exists no δ > 0 such that f is strictly increasing in the interval ]−δ, δ[.

Hint. Try to exploit the wildly oscillating function sin(1/x).

7. Give an example of a differentiable function f such that f’(0) = 0 and in every
interval ]−δ, δ[ (with δ > 0) the derivative f’ takes arbitrarily large positive values
and arbitrarily large negative values.


5.6 The Mean Value Theorem 


It has been called the most useful theorem in analysis (notably by the influential 
French mathematician and Bourbakiste, Jean Dieudonné, but he was probably echo- 
ing G. H. Hardy). We leave it to the reader to judge the truth or otherwise of this 
claim. It might seem more logical to write “mean-value theorem”, as there is nothing
mean about it, nor is it one of a collection of value theorems. The lack of a hyphen is
sanctioned by usage, as it is in the names of other theorems with compound qualifiers
(as in “small oscillation theorem”).

In the following, the interval [a,b] has distinct endpoints, and is manifestly 
bounded and closed. 


Proposition 5.7 (Rolle’s theorem) Let f : [a, b] → R be a continuous function that
is differentiable for a < x < b. Assume that f(a) = f(b). Then there exists c, such
that a < c < b and f’(c) = 0.


Proof Let $m = \inf_{[a,b]} f$ and $M = \sup_{[a,b]} f$ (both m and M are attained by the
extreme value theorem, so they are minimum and maximum). If m = M then f is
constant and so f’(x) = 0 for all x in ]a, b[ and we are done.

Assume next that m < M. If both values were attained only at the endpoints, then, since
f(a) = f(b), we would again have m = M. So at least one of them is attained at an interior
point. Either there exists c in ]a, b[ such that f(c) = m or there exists c in ]a, b[
such that f(c) = M. In both these cases we have f’(c) = 0 by Proposition 5.6.


Proposition 5.8 (Mean value theorem) Let f : [a, b] → R be a continuous function
that is differentiable for a < x < b. Then there exists c in the open interval ]a, b[,
such that

$$f(b) - f(a) = f'(c)(b - a).$$

Proof Set

$$A = \frac{f(b) - f(a)}{b - a}.$$

Then f(a) − Aa = f(b) − Ab. Define the function g(x) = f(x) − Ax for
a ≤ x ≤ b. Now g(a) = g(b) and we deduce by Rolle’s theorem that there exists c, such that a < c < b
and g’(c) = 0. Then f’(c) − A = 0 and we find

$$f'(c) = \frac{f(b) - f(a)}{b - a},$$

which gives f(b) − f(a) = f’(c)(b − a).
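The theorem only asserts that c exists, but in a concrete case c can be located numerically by solving f’(x) = A, where A is the slope of the chord. The following Python sketch does this by bisection for the illustrative choice f(x) = x³ on [0, 2]; the function, the interval and the helper `bisect_root` are our own assumptions, chosen so that f’(x) − A changes sign on the interval.

```python
def bisect_root(phi, lo, hi, tol=1e-12):
    """Locate a zero of phi in [lo, hi] by bisection; assumes phi changes sign there."""
    phi_lo = phi(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        phi_mid = phi(mid)
        if (phi_lo < 0) == (phi_mid < 0):
            lo, phi_lo = mid, phi_mid  # sign change lies in the right half
        else:
            hi = mid                   # sign change lies in the left half
    return 0.5 * (lo + hi)


def f(x):
    return x ** 3


def df(x):
    return 3 * x ** 2


a, b = 0.0, 2.0
A = (f(b) - f(a)) / (b - a)                 # slope of the chord, here 4
c = bisect_root(lambda x: df(x) - A, a, b)  # solves f'(c) = A

print("c =", c)
print("f(b) - f(a) =", f(b) - f(a), " f'(c)(b - a) =", df(c) * (b - a))
```

Here the exact value is c = 2/√3, which lies in ]0, 2[ as the theorem requires.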


5.6.1 First Consequences of the Mean Value Theorem 


The following reformulation of the mean value theorem is often useful. 


Proposition 5.9 (Mean value theorem, version 2) Let A = ]a, b[ and let f : A → R
be a differentiable function. Let x and x + h both lie in A (note that h could
be negative). Then there exists θ, such that 0 < θ < 1 and f(x + h) = f(x) +
h f’(x + θh).


Proof Apply the mean value theorem to the interval with endpoints x and x + h,
and write the point c in the form c = x + θh. Then 0 < θ < 1.


It is surprising that only now, with the mean value theorem in place, do we have 
the machinery to give a nice proof of the following “obvious” result. 


Proposition 5.10 Let A be an open interval (which could be unbounded), let f be 
differentiable in A and suppose that f'(x) = 0 for all x in A. Then f is a constant 
in the interval A. 


Proof Let a and b be distinct points in A. By the mean value theorem we have f(b) − f(a)
= f’(c)(b − a) for some c between a and b. Since f’(c) = 0 we obtain f(b) − f(a) = 0, that is,
f(a) = f(b). We deduce that f is constant in A.


It is important that A should be an interval. If, for example, A is the set
]0, 1[ ∪ ]2, 3[, then there exists a function with domain A, differentiable and
satisfying f’(x) = 0 at every point of A, but f is not constant in A.

Another important application is to give a criterion for a function to be increasing 
or decreasing. 


Proposition 5.11 Let the function f : ]a, b[ → R be differentiable and assume that
f’(x) > 0 for all x in ]a, b[. Then f is strictly increasing. If, on the other hand,
f’(x) < 0 for all x in ]a, b[, then f is strictly decreasing.

Proof Let f’(x) > 0 for all x in ]a, b[ and let a < x₁ < x₂ < b. Then there exists c in
]x₁, x₂[, such that f(x₂) − f(x₁) = f’(c)(x₂ − x₁) > 0. The argument for decreasing
is similar.


Note that if f is strictly increasing we cannot conclude that f’(x) > 0 for all x.
However, dropping the strictness, we obtain a correct result: a function differentiable
in an open interval is increasing if and only if f’(x) ≥ 0 for all x in its domain. See
the exercises.


5.6.2 Exercises 


1. Find a function f for which the domain is the open set ]0, 1[ U ]2, 3[, and is such 
that the derivative of f is zero at each point in its domain, but f is not a constant 
function. 

2. Show that Rolle’s theorem is true in case f is defined and differentiable in the
open interval ]a, b[, and $\lim_{x \to a+} f(x) = \lim_{x \to b-} f(x)$. Note that a could be
−∞ and b could be +∞. Furthermore the two limits could also be infinite.

3. Find an example of a function f, differentiable and strictly increasing in an open
interval A, but for which the inequality f’(x) > 0 fails for at least one point x.

4. Show that a function f, differentiable in an open interval A, is increasing if and
only if f’(x) ≥ 0 for all x in A.

5. Let f be a function on the domain ]a, b[, where b is a finite number. Suppose that
f is differentiable and there exists a constant K, such that |f’(x)| < K for all x
in the domain. Show that $\lim_{t \to b-} f(t)$ exists and is a finite number.

6. Suppose that f is known to be continuous in an interval ]a, b[, differentiable at all
points in ]a, b[ except possibly at a point c, and it is known that $\lim_{x \to c} f'(x) = \ell$,
where ℓ is a finite number. Show that f is differentiable at c and f’(c) = ℓ.
Note. In cases when f(x) is given by a nice formula for x ≠ c (such as in Sect. 5.2, Exercise
3) many a beginning student might compute the derivative f’(c) correctly by taking the limit
$\lim_{x \to c} f'(x)$ without appreciating that it needs justification. Should a teacher give “correct”?

7. Let the function f be defined and continuous in an open interval A. Suppose
that c is a point in A and that f has derivatives up to order m on the set A \ {c}.
Suppose further that $\lim_{x \to c} f^{(k)}(x)$ exists for k = 1, ..., m and the limits are
finite numbers. Show that f has derivatives up to order m in all of A. Moreover
$f^{(k)}(c) = \lim_{x \to c} f^{(k)}(x)$, for k = 1, ..., m.

8. We have seen that a continuous function f on a domain A, where A is an open 
interval, has the intermediate value property: if a and b are points of A, and η lies
strictly between f(a) and f(b), then there exists c strictly between a and b such
that f(c) = η.

There is another general class of functions that possess the intermediate value 
property. Suppose that f is differentiable everywhere in A. There is no reason to 
suppose that f’ is continuous; indeed it may have discontinuities. Nevertheless 
f’ has the intermediate value property. Prove this.
Hint. Begin with the case when f’(a) > 0, f’(b) < 0 and show that there exists 
c between a and b such that f’(c) = 0. 
Note. The intermediate value property for derivatives is often treated as an intriguing but oth- 
erwise rather abstruse fact. Actually it is often useful in conjunction with the fact embodied in 
Sect. 4.3, Exercise 6, when multiple applications of the mean value theorem or Taylor’s theorem 
are used to derive remainder formulas. 

9. Suppose that f is differentiable and that f’ is monotonic. Prove that f’ is con- 
tinuous. 


5.7 The Derivative as a Linear Approximation


If f is differentiable at a we can rewrite the formula

$$f'(a) = \lim_{x \to a} \frac{f(x) - f(a)}{x - a}$$

by defining R(x, a) to satisfy

$$f(x) = f(a) + (x - a) f'(a) + R(x, a),$$

and viewing R(x, a) as the error when f(x) is approximated by the first-degree
polynomial f(a) + (x − a) f’(a). The approximation improves as x approaches a, not
just because $\lim_{x \to a} R(x, a) = 0$, but because of the stronger conclusion (embodied
in the definition of derivative as limit) that

$$\frac{R(x, a)}{x - a} = \frac{f(x) - f(a)}{x - a} - f'(a) \to 0 \quad (\text{when } x \to a).$$

The error becomes arbitrarily small in comparison with x − a as x approaches a.
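A quick numerical experiment makes the last statement tangible. The sketch below is only an illustration: the choice f = sin at a = 1 is ours, and it tabulates |R(x, a)/(x − a)| for x = a + 10⁻ᵏ, which should shrink with x − a.

```python
import math


def remainder(f, df, a, x):
    """R(x, a) = f(x) - f(a) - (x - a) f'(a), the error of the linear approximation."""
    return f(x) - f(a) - (x - a) * df(a)


a = 1.0
ratios = []
for k in range(1, 6):
    x = a + 10.0 ** (-k)
    r = remainder(math.sin, math.cos, a, x)
    ratios.append(abs(r / (x - a)))
    print(f"x - a = 1e-{k}:  |R(x, a)| = {abs(r):.3e},  |R(x, a)/(x - a)| = {ratios[-1]:.3e}")
```

Each row shows that the error is not merely small but small compared with x − a.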


5.7.1 Higher Derivatives and Taylor Polynomials 


Now suppose that $f'(a), \ldots, f^{(m)}(a)$ all exist (for the meaning of this see Sect. 5.4).
As a generalisation of the above approximation rule using the first derivative, we
have the more complicated

$$f(x) = f(a) + (x - a) f'(a) + \frac{1}{2!}(x - a)^2 f''(a) + \cdots + \frac{1}{m!}(x - a)^m f^{(m)}(a) + R_{m+1}(x, a)$$

and the error $R_{m+1}(x, a)$ satisfies

$$\lim_{x \to a} \frac{R_{m+1}(x, a)}{(x - a)^m} = 0.$$

The polynomial

$$P_m(x, a) = f(a) + (x - a) f'(a) + \frac{1}{2!}(x - a)^2 f''(a) + \cdots + \frac{1}{m!}(x - a)^m f^{(m)}(a)$$

is called the Taylor polynomial of f with degree m centred at a. As a polynomial in
x it may have degree less than m so maybe one should use the term “order” instead of
“degree”. Whichever term one uses it is an approximation to f(x) which improves
as x approaches a, in the sense that the error becomes arbitrarily small in comparison
with $(x - a)^m$. The proof of this will be given shortly.
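As a numerical illustration of this approximation property, the Python sketch below builds the Taylor polynomial of order m = 3 for the exponential function centred at a = 0 and prints the error divided by (x − a)³. The choice of function is ours; it is convenient because every derivative of exp at a equals exp(a). The helper name `taylor_exp` is hypothetical.

```python
import math


def taylor_exp(x, a, m):
    """Taylor polynomial of exp of order m centred at a; all derivatives at a equal exp(a)."""
    return sum(math.exp(a) * (x - a) ** k / math.factorial(k) for k in range(m + 1))


a, m = 0.0, 3
ratios = []
for h in (1e-1, 1e-2, 1e-3):
    x = a + h
    error = math.exp(x) - taylor_exp(x, a, m)
    ratios.append(abs(error) / abs(x - a) ** m)
    print(f"x - a = {h:g}:  error = {error:.3e},  error/(x - a)^{m} = {ratios[-1]:.3e}")
```

The printed ratios decrease roughly in proportion to x − a, in line with the claim that the error is small in comparison with (x − a)ᵐ.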


5.7.2 Comparison to Taylor’s Theorem 


The claim made in the last subsection is not what we nowadays call Taylor’s theorem. 
That celebrated result, which will be proved later, consists essentially of an estimate 
for the error $R_{m+1}(x, a)$, that often allows us to conclude that if x is kept constant and
if f satisfies certain additional conditions, then $R_{m+1}(x, a)$ tends to 0 as m → ∞.
Sometimes the conclusion stated in the previous section is called Peano’s form of 
Taylor’s theorem, and sometimes Young’s, but these terms can be confusing and it 
is better to maintain a clear distinction between it and Taylor’s theorem. One might 
add another source of confusion, that most named attributions of versions of Taylor’s 
theorem are historically questionable (including the attribution to Taylor). 
Compare these conclusions. In all cases we have

$$f(x) = P_m(x, a) + R_{m+1}(x, a).$$

The conclusion stated in the previous section was

$$\lim_{x \to a} \frac{R_{m+1}(x, a)}{(x - a)^m} = 0 \quad (\text{note: } m \text{ is held constant}).$$

Taylor’s theorem can sometimes justify the conclusion, that for all x in some interval
]a − h, a + h[, we have

$$\lim_{m \to \infty} R_{m+1}(x, a) = 0 \quad (\text{note: } x \text{ is held constant})$$

but to obtain this conclusion a close examination of f is usually needed.
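The difference between the two limits can be seen numerically. The sketch below uses log(1 + x) centred at 0 as an illustrative example of our own choosing; its Taylor polynomial of order m is the familiar alternating sum. With m held constant the remainder is small compared with xᵐ as x → 0, whereas with x held constant the behaviour as m grows depends on x: the remainder shrinks at x = 0.5 but grows at x = 2.

```python
import math


def taylor_log1p(x, m):
    """Taylor polynomial of log(1 + x) of order m centred at 0."""
    return sum((-1) ** (k + 1) * x ** k / k for k in range(1, m + 1))


def remainder(x, m):
    return math.log1p(x) - taylor_log1p(x, m)


# m held constant, x -> 0: the remainder divided by x^m becomes small.
for x in (1e-1, 1e-2, 1e-3):
    print(f"m = 3, x = {x:g}:  remainder/x^3 = {remainder(x, 3) / x ** 3:.3e}")

# x held constant, m increasing: the outcome depends on x.
for x in (0.5, 2.0):
    for m in (5, 10, 20):
        print(f"x = {x:g}, m = {m}:  remainder = {remainder(x, m):.3e}")
```

No conclusion about the second limit could have been drawn from the first; that is what the estimate in Taylor’s theorem is for.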




5.7.3 Cauchy’s Form of the Mean Value Theorem 


Proposition 5.12 Let f : [a, b] → R and g : [a, b] → R be continuous functions
that are differentiable for a < x < b. Then there exists c in the open interval ]a, b[,
such that

$$(f(b) - f(a))\, g'(c) = (g(b) - g(a))\, f'(c).$$

Proof Let h(x) = A f(x) + B g(x) for a ≤ x ≤ b. Choose the constants A and B
so that they are not both zero but h(a) = h(b) (possible in many ways by simple
algebra). Then by Rolle’s theorem there exists c in ]a, b[, such that h’(c) = 0, which is to say

$$A f'(c) + B g'(c) = 0.$$

But since h(b) − h(a) = 0 we also have

$$A(f(b) - f(a)) + B(g(b) - g(a)) = 0.$$

Now A and B are not both 0, so by a popular rule of linear algebra (a homogeneous pair
of linear equations with a non-trivial solution has determinant zero) we must have

$$(f(b) - f(a))\, g'(c) = (g(b) - g(a))\, f'(c)$$

as required.
The formula in Cauchy’s mean value theorem can be written in the memorable
form

$$\frac{f(b) - f(a)}{g(b) - g(a)} = \frac{f'(c)}{g'(c)}$$

provided neither denominator is zero.

Another way to state and prove Cauchy’s mean value theorem uses determinants.
We set

$$\phi(x) = \begin{vmatrix} f(b) - f(a) & f(x) \\ g(b) - g(a) & g(x) \end{vmatrix},$$

note that φ(a) = φ(b), and deduce, by Rolle’s theorem, that for some c between a
and b we have

$$\begin{vmatrix} f(b) - f(a) & f'(c) \\ g(b) - g(a) & g'(c) \end{vmatrix} = 0.$$
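For a concrete pair of functions the point c of Cauchy’s mean value theorem can be found numerically as a zero of the derivative of the determinant φ. The sketch below is an illustration with our own choice f(x) = x², g(x) = x³ on [1, 2]; for this pair φ’ is negative at a and positive at b, so a simple bisection suffices.

```python
def f(x):
    return x ** 2


def df(x):
    return 2 * x


def g(x):
    return x ** 3


def dg(x):
    return 3 * x ** 2


a, b = 1.0, 2.0


def phi_prime(x):
    """Derivative of phi(x) = (f(b) - f(a)) g(x) - (g(b) - g(a)) f(x)."""
    return (f(b) - f(a)) * dg(x) - (g(b) - g(a)) * df(x)


# phi_prime(a) = -5 < 0 < 8 = phi_prime(b), so bisection finds a zero in ]a, b[.
lo, hi = a, b
while hi - lo > 1e-12:
    mid = 0.5 * (lo + hi)
    if phi_prime(mid) < 0:
        lo = mid
    else:
        hi = mid
c = 0.5 * (lo + hi)

print("c =", c)
print("(f(b) - f(a)) g'(c) =", (f(b) - f(a)) * dg(c))
print("(g(b) - g(a)) f'(c) =", (g(b) - g(a)) * df(c))
```

In this example the equation reads 9c² = 14c, so c = 14/9.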


5.7.4 Geometric Interpretation of the Mean Value Theorem 


Consider the curve y = f(x) in the (x, y)-plane between x = a and x = b. The mean
value theorem says that there exists a point c between a and b where the tangent line
to the curve at the point (c, f(c)) is parallel to the chord joining the points (a, f(a))
and (b, f(b)). There may be more than one point with the property possessed by c.
This is illustrated in Fig. 5.3.

Fig. 5.3 The mean value theorem at a glance

Fig. 5.4 Cauchy’s mean value theorem at a glance

Cauchy’s mean value theorem also has a geometric interpretation, but requires 
the use of plane, parametrised curves. We assume that the reader is acquainted with 
these and with vectors in the plane. 

The equation

$$(f(b) - f(a))\, g'(c) = (g(b) - g(a))\, f'(c)$$

says that the vectors (f(b) − f(a), g(b) − g(a)) and (f’(c), g’(c)) are parallel (as
is clear from the determinantal version given in the last section). Consider now the
parametrised plane curve

$$x = f(t), \quad y = g(t).$$

Given the parameters t = a and t = b, yielding points (f(a), g(a)) and (f(b), g(b))
on the curve, there exists a parameter t = c between a and b, such that the tangent
line to the curve at parameter t = c is parallel to the chord joining (f(a), g(a)) and
(f(b), g(b)). This is illustrated in Fig. 5.4.

There may be more than one tangent to the curve at the plane point (f(c), g(c)), 
as the curve may happen to cross itself at this point. That is why we refer to the 
tangent as being at parameter value c, rather than at the point (f(c), g(c)). 
5.7.5 Exercises


1. The idea that f(a) + h f’(a) is an approximation to f(a + h) gives us a useful
rule of thumb for improving an initial approximation to f(a + h). Try it in
the following examples. You can test the improvement using a calculator.


(a) $\sqrt{65}$ with initial approximation 8.
(b) $\sqrt[3]{28}$ with initial approximation 3.
(c) $626^{3/4}$ with initial approximation 125.


2. Let f be twice differentiable in an interval A. Show that for every distinct pair of
points a and x in A there is a point ξ, strictly between a and x, such that

$$f(x) - f(a) - f'(a)(x - a) = \frac{1}{2} f''(\xi)(x - a)^2.$$

Hint. Apply Cauchy’s mean value theorem twice in a row to the quotient

$$\frac{f(x) - f(a) - f'(a)(x - a)}{(x - a)^2}.$$


Note. This result is a particular case of Taylor’s theorem. It enables us to estimate the error, 
when f(x) is approximated by f(a) + f’(a)(x — a), if we know some bound for the second 
derivative. 

3. Use the result of the previous exercise to estimate the error in the approximations 
of Exercise 1. 

4. Prove Liouville’s theorem. Let α be a root of the polynomial equation
$P(x) := a_n x^n + a_{n-1} x^{n-1} + \cdots + a_0 = 0$ where the coefficients $a_0, a_1, \ldots, a_n$ are integers
and $a_n \neq 0$. Assume also that this equation has no rational solutions. Prove that
there exists a number c > 0, such that for all rational numbers p/q the inequality

$$\left| \alpha - \frac{p}{q} \right| \ge \frac{c}{q^n}$$

holds. Use this to show that the number

$$L := \sum_{n=1}^{\infty} 10^{-n!}$$

is not the root of any polynomial equation with integer coefficients.

Hint. Assuming first that |α − (p/q)| < 1 apply the mean value theorem to the
difference P(α) − P(p/q). Then remove the assumption.

Note. Real numbers can be divided into two classes. The algebraic numbers are those that are the 
roots of polynomial equations with integer coefficients. These include not only all the rational 
numbers, but also numbers like $\sqrt{2}$, $\sqrt{2 + \sqrt{2}}$ and $\sqrt[3]{5}$, that is, numbers expressible by radicals.
They also include numbers like the positive root of x> — x + 1 = 0 which cannot be expressed 
by radicals. Irrational numbers that are not algebraic are called transcendental numbers. For a
long time it was not known whether any transcendental numbers existed. Then finally, in 1844,
Liouville exhibited an example of a transcendental number, similar to the one appearing in this
exercise.


5.8 L’Hopital’s Rule 


In its simplest form this popular and useful rule is as follows. Let A be an open
interval, and f : A → R, g : A → R differentiable functions. Let c be a point in A,
and suppose that f(c) = g(c) = 0, but that g’(c) ≠ 0. Then we have

$$\lim_{x \to c} \frac{f(x)}{g(x)} = \frac{f'(c)}{g'(c)}.$$

To prove this we simply observe that for x ≠ c (and x close enough to c that g(x) ≠ 0) we have

$$\frac{f(x)}{g(x)} = \frac{\dfrac{f(x) - f(c)}{x - c}}{\dfrac{g(x) - g(c)}{x - c}}$$

and this tends to f’(c)/g’(c) as x tends to c, by the definition of derivative and the
rule for the limit of a quotient.
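The simple form of the rule is easy to check numerically. In the sketch below the functions f(x) = x² − 1 and g(x) = x³ − 1 and the point c = 1 are our own illustrative choices; both functions vanish at c and g’(c) = 3 ≠ 0, so the rule predicts the limit f’(c)/g’(c) = 2/3.

```python
def f(x):
    return x ** 2 - 1


def g(x):
    return x ** 3 - 1


c = 1.0
predicted = (2 * c) / (3 * c ** 2)  # f'(c)/g'(c) with f'(x) = 2x, g'(x) = 3x^2

quotients = []
for h in (1e-2, 1e-4, 1e-6):
    x = c + h
    quotients.append(f(x) / g(x))
    print(f"x = c + {h:g}:  f(x)/g(x) = {quotients[-1]:.8f}")

print("predicted limit f'(c)/g'(c) =", predicted)
```

The computed quotients approach 2/3 as x approaches c.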

L’Hopital’s rule can be framed in a more general form that vastly increases its
usefulness. We no longer assume that the derivatives f’(c) and g'(c) exist. Instead 
we assume that f’(x)/g’(x) tends to a limit as x tends to c. It is even more useful 
to take the limit as one-sided; after all, a two-sided limit is just a pair of one-sided 
limits that happen to be equal. 


Proposition 5.13 (L’Hopital’s rule for 0/0) Let f : ]a, b[ → R, g : ]a, b[ → R be
differentiable functions such that g’(x) ≠ 0 for all x in the interval of definition.
Suppose that

$$\lim_{x \to a+} f(x) = 0, \quad \lim_{x \to a+} g(x) = 0, \quad \lim_{x \to a+} \frac{f'(x)}{g'(x)} = t.$$

Then

$$\lim_{x \to a+} \frac{f(x)}{g(x)} = t.$$

A similar conclusion holds for the limit $\lim_{x \to b-} f(x)/g(x)$.
The rule also holds if a = −∞ or b = ∞; or if t = ∞ or t = −∞.
Proof Consider first the right-hand limit at a in the case that a is not −∞ and t is a
finite number.
Let ε > 0. There exists δ > 0, such that

$$\left| \frac{f'(x)}{g'(x)} - t \right| < \varepsilon$$

for all x that satisfy a < x < a + δ. Let x and y satisfy a < y < x < a + δ. By
Cauchy’s form of the mean value theorem there exists z between x and y, such that

$$\frac{f(x) - f(y)}{g(x) - g(y)} = \frac{f'(z)}{g'(z)}.$$

We deduce that for all such x and y we have

$$\left| \frac{f(x) - f(y)}{g(x) - g(y)} - t \right| < \varepsilon.$$

Let now y → a+. We have that

$$\lim_{y \to a+} \frac{f(x) - f(y)}{g(x) - g(y)} = \frac{f(x)}{g(x)},$$

so that the inequality

$$\left| \frac{f(x)}{g(x)} - t \right| \le \varepsilon$$

holds for all x that satisfy a < x < a + δ. This proves the first assertion of L’Hopital’s
rule.

Next consider the case when b = ∞ and t is a finite number. We will determine
$\lim_{x \to \infty} f(x)/g(x)$, the assumptions being that $\lim_{x \to \infty} f(x)$ and $\lim_{x \to \infty} g(x)$ are
both 0, and $\lim_{x \to \infty} f'(x)/g'(x) = t$.

Let ε > 0. There exists K, such that

$$\left| \frac{f'(x)}{g'(x)} - t \right| < \varepsilon$$

for all x > K. Let K < x < y. There exists z between x and y, such that

$$\frac{f(x) - f(y)}{g(x) - g(y)} = \frac{f'(z)}{g'(z)}$$

and therefore, for all such x and y we have

$$\left| \frac{f(x) - f(y)}{g(x) - g(y)} - t \right| < \varepsilon.$$

Now let y → ∞. We find that

$$\left| \frac{f(x)}{g(x)} - t \right| \le \varepsilon$$

for all x that satisfy x > K, thus proving the rule in this case.

Consider the case when t = ∞, and a is a finite number. We shall determine
$\lim_{x \to a+} f(x)/g(x)$, the assumptions being that $\lim_{x \to a+} f(x)$ and $\lim_{x \to a+} g(x)$
are both 0, and $\lim_{x \to a+} f'(x)/g'(x) = \infty$.

Let K be a real number and choose δ > 0, such that

$$\frac{f'(x)}{g'(x)} > K$$

for all x that satisfy a < x < a + δ. For all x and y that satisfy a < y < x < a + δ
we obtain

$$\frac{f(x) - f(y)}{g(x) - g(y)} > K.$$

Let y → a+. We deduce that f(x)/g(x) ≥ K for all x that satisfy a < x < a + δ.
Since K was arbitrary, f(x)/g(x) tends to ∞.

The reader should write out the proofs for all the remaining cases; each is similar 
to one of the cases treated above. The common feature is the use of Cauchy’s mean 
value theorem. 


5.8.1 Using L’Hopital’s Rule 


There are two important things to bear in mind when one uses L’Hopital’s rule. Firstly,
f(x) and g(x) should both tend to 0 at the point where the limit of f(x)/g(x) is
sought. This is why we sometimes say that the rule resolves the indeterminate form
0/0. Failure to observe this can lead to mistakes.

Secondly, we must observe the premise that f’(x)/g’(x) has a limit. Thus it is not
strictly correct, having first observed that f(x) and g(x) both tend to 0, to write that

$$\lim_{x \to a+} \frac{f(x)}{g(x)} = \lim_{x \to a+} \frac{f'(x)}{g'(x)},$$

before ascertaining that the limit on the right actually exists. For example there are
cases when the limit on the left-hand side exists, but the limit on the right does not.
Even so, we often write this, in the spirit of “let’s wait and see,” especially when the
rule is used iteratively (more on this later) and it rarely leads to mistakes.
5.8.2 Is There an Error in the Proof?


The claim that

$$\lim_{y \to a+} \frac{f(x) - f(y)}{g(x) - g(y)} = \frac{f(x)}{g(x)}$$

appears in the proof of L’Hopital’s rule. For this to be correct we must know that
g(x) ≠ 0. We are assuming that $\lim_{x \to a+} g(x) = 0$ and there appears to be a danger
that g(x) might be 0 for some values of x near to a, maybe even for infinitely many
values.

In fact we are safe on this score. One assumption was that g’(x) ≠ 0 for a < x < b.
Therefore given ε > 0 we can (referring to the proof) find δ > 0 having the properties
stated in the proof, but also such that g(x) ≠ 0 for all x that satisfy a < x < a + δ.
This is because the equation g(x) = 0 can have at most one solution in the open
interval ]a, b[, since otherwise, by Rolle’s theorem, g’ would have a zero in ]a, b[.

The assumption that g’(x) ≠ 0 for all x in its domain of definition is unnecessarily
strong for applying L’Hopital’s rule to calculate $\lim_{x \to a+} f(x)/g(x)$. Obviously
it is enough that there should exist h > 0, such that g’(x) ≠ 0 for a < x < a + h.


5.8.3 Geometric Interpretation of L’Hopital’s Rule


The two functions f and g define a parametric curve in the (x, y)-plane by letting
x = g(t), y = f(t) for a < t < b. The assumptions that $\lim_{t \to a+} f(t) = \lim_{t \to a+} g(t) = 0$
have the geometrical interpretation that the initial point of the curve is the
coordinate origin O = (0, 0). Let P(t) denote the point (g(t), f(t)) in the plane,
that is, the point on the curve with parameter t. Associated with the point P(t) on the
curve we can construct two lines. Firstly, the tangent at the point P(t), corresponding
to parameter t. Careful! The curve might cross itself. Secondly the chord joining P(t)
to the origin O.

L’Hopital’s rule says the following: if the slope of the tangent has a limit as t
tends to a (from the right), then the slope of the chord joining P(t) to O has the same
limit. The result is also valid if the limit is infinite; both chord and tangent then tend
to a vertical position. The geometric interpretation of L’Hopital’s rule is illustrated
in Fig. 5.5.


5.8.4 Iterative Use of L’Hopital’s Rule: Taylor Polynomials Again


If we wish to find the limit $\lim_{x \to a+} f(x)/g(x)$ using L’Hopital’s rule we are directed
to find the limit $\lim_{x \to a+} f'(x)/g'(x)$. This limit, too, may be found by L’Hopital’s
rule if it happens that $\lim_{x \to a+} f'(x) = \lim_{x \to a+} g'(x) = 0$; then we are directed
to the limit $\lim_{x \to a+} f''(x)/g''(x)$. Again it could happen that $\lim_{x \to a+} f''(x) =
\lim_{x \to a+} g''(x) = 0$. As long as numerator and denominator have the limit 0 we may
differentiate them, until a limit is found that we can easily compute.

Fig. 5.5 L’Hopital’s rule at a glance. The chords marked a, b, c, d are parallel to the
tangents at the points on the curve marked A, B, C, D respectively. It shows graphically
why the slope of the chord has the same limit as the slope of the tangent.
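A small numerical check shows the iteration at work. The example is our own: for f(x) = 1 − cos x and g(x) = x² both f, g and f’, g’ tend to 0 at 0, while f’’(x)/g’’(x) = (cos x)/2 tends to 1/2. The sketch below (assuming the usual derivatives of the circular functions) evaluates the three successive quotients near 0.

```python
import math

rows = []
for x in (1e-1, 1e-2, 1e-3):
    q0 = (1 - math.cos(x)) / x ** 2  # f(x)/g(x), the quotient we want
    q1 = math.sin(x) / (2 * x)       # f'(x)/g'(x), still of the form 0/0
    q2 = math.cos(x) / 2             # f''(x)/g''(x), limit is evident
    rows.append((x, q0, q1, q2))
    print(f"x = {x:g}:  f/g = {q0:.8f},  f'/g' = {q1:.8f},  f''/g'' = {q2:.8f}")
```

All three columns approach 1/2, as the iterated rule predicts.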

The iterative use of L’Hopital’s rule gives an easy proof of the approximation
property of Taylor polynomials stated in Sect. 5.7 under the heading “Higher derivatives
and Taylor polynomials”.


Proposition 5.14 Let f : A → R, where A is an open interval, and let c ∈ A.
Assume that the derivatives $f'(c), \ldots, f^{(m)}(c)$ all exist and define

$$E(h) = f(c + h) - \left( f(c) + \frac{1}{1!} f'(c) h + \frac{1}{2!} f''(c) h^2 + \cdots + \frac{1}{m!} f^{(m)}(c) h^m \right)$$

for all h such that |h| is sufficiently small. Then

$$\lim_{h \to 0} \frac{E(h)}{h^m} = 0.$$

Note that h can be positive or negative, but we require that c + h is in the interval A.
That is why we want |h| to be “sufficiently small”.

The assumption that f has derivatives at c up to order m means that f is (m − 1)-
times differentiable in some open interval containing c, and $f^{(m-1)}$ is differentiable
at c.


Proof of the Proposition Differentiating E(h) repeatedly with respect to h we
obtain, for j = 1, ..., m − 1 and for all h such that |h| is sufficiently small,

$$E^{(j)}(h) = f^{(j)}(c + h) - \left( f^{(j)}(c) + \frac{1}{1!} f^{(j+1)}(c) h + \cdots + \frac{1}{(m - j)!} f^{(m)}(c) h^{m-j} \right),$$

from which we see that $E^{(j)}(0) = 0$ for j = 1, ..., m − 1. We also have (convenient
to use Leibniz’s notation here)

$$\frac{d^j}{dh^j} h^m = \frac{m!}{(m - j)!} h^{m-j} \quad \text{for } j = 0, 1, 2, \ldots, m,$$

so that

$$\left. \frac{d^j}{dh^j} h^m \right|_{h=0} = 0 \quad \text{for } j = 0, 1, 2, \ldots, m - 1.$$

Using L’Hopital’s rule iteratively (with a “wait and see” approach to the existence
of the limits) now gives

$$\lim_{h \to 0} \frac{E(h)}{h^m} = \lim_{h \to 0} \frac{E^{(m-1)}(h)}{m!\, h} = \frac{1}{m!} \lim_{h \to 0} \frac{f^{(m-1)}(c + h) - f^{(m-1)}(c) - f^{(m)}(c) h}{h}$$

and this is 0 by definition of derivative (and it also tells us that $E^{(m)}(0) = 0$). This
completes the proof.


Let us write x − c for h in the formula for E(h). We obtain the conclusion

$$\lim_{x \to c} \frac{f(x) - P_m(x, c)}{(x - c)^m} = 0.$$

This describes admirably how the approximation to f(x) by the Taylor polynomial
$P_m(x, c)$ improves sharply as x approaches c. On the other hand there is no reason to
think that the approximation improves if we hold x fixed and increase m (assuming
we have the derivatives). This question is partly settled by Taylor’s theorem proper
in a later chapter.


5.8.5 Application to Maxima and Minima 


If the derivative of f at c is zero, the examination of higher derivatives at c can 
sometimes resolve the question as to whether c is a local maximum point or a local 
minimum point. 


Proposition 5.15 Let f : ]a, b[ → R and let a < c < b. Assume that f is (m − 1)-
times differentiable, that $f^{(j)}(c) = 0$ for j = 1, 2, ..., m − 1, but that $f^{(m)}(c)$ exists
and is not 0. In addition to all this assume that m is an even number. The following
conclusions then hold:


(1) If $f^{(m)}(c) > 0$ then c is a strict local minimum point.
(2) If $f^{(m)}(c) < 0$ then c is a strict local maximum point.


Proof By Proposition 5.14 we have

$$f(c + h) - f(c) = \frac{1}{m!} f^{(m)}(c) h^m + E(h)$$

where the error term E(h) satisfies

$$\lim_{h \to 0} \frac{E(h)}{h^m} = 0.$$

For h ≠ 0 we can write

$$\frac{f(c + h) - f(c)}{h^m} = \frac{1}{m!} f^{(m)}(c) + \frac{E(h)}{h^m}$$

and we know that $f^{(m)}(c) \neq 0$. Since m is even we must have $h^m > 0$, both for h > 0
and for h < 0. We conclude that there exists δ > 0, such that f(c + h) − f(c) ≠ 0
and has the same sign as $f^{(m)}(c)$, for all h that satisfy 0 < |h| < δ. This is precisely
the sought-for conclusion.
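For polynomials the test of Proposition 5.15 can be carried out mechanically, since derivatives are computed exactly from the coefficients. The Python sketch below is an illustration only; the representation of a polynomial as a coefficient list [a0, a1, a2, ...] and the function names are our own. It finds the first derivative that does not vanish at c and applies the proposition when its order m is even; for other cases it merely reports that the proposition does not apply.

```python
def derivative(coeffs):
    """Differentiate the polynomial a0 + a1 x + a2 x^2 + ... given as [a0, a1, a2, ...]."""
    return [k * a for k, a in enumerate(coeffs)][1:]


def evaluate(coeffs, x):
    """Evaluate the polynomial at x by Horner's scheme (an empty list is the zero polynomial)."""
    result = 0
    for a in reversed(coeffs):
        result = result * x + a
    return result


def classify(coeffs, c):
    """Find the first non-vanishing derivative at c and apply Proposition 5.15 if possible."""
    m, d = 0, coeffs
    while d:
        d = derivative(d)
        m += 1
        value = evaluate(d, c)
        if value != 0:
            break
    else:
        return "all derivatives vanish at c: the test does not apply"
    if m == 1:
        return "f'(c) is not 0: c is not a critical point"
    if m % 2 == 1:
        return "m is odd: Proposition 5.15 does not apply"
    return "strict local minimum" if value > 0 else "strict local maximum"


print(classify([0, 0, 0, 0, 1], 0))  # x^4 at c = 0
print(classify([0, 0, -1], 0))       # -x^2 at c = 0
print(classify([0, 0, 0, 1], 0))     # x^3 at c = 0
```

Integer coefficients keep the arithmetic exact, so the comparison with 0 is reliable; with floating-point coefficients one would need a tolerance.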


5.8.6 More on L’Hopital’s Rule: The ∞/∞ Version


Sometimes we consider L’Hopital’s rule as resolving the indeterminate form 0/0, an
expression that is really quite meaningless. Since we are indulging in meaninglessness
we might suggest some other indeterminate forms, for example


These can often be resolved by some judicious manipulations combined with
L’Hopital’s rule. However there is a version of the rule directly applicable to ∞/∞
and, as we shall see, it turns out to be very useful.


Proposition 5.16 Let f : ]a, b[ → R, g : ]a, b[ → R be differentiable functions
such that g’(x) ≠ 0 for all x in its domain of definition. Assume that

$$\lim_{x \to a+} g(x) = \infty, \quad \lim_{x \to a+} \frac{f'(x)}{g'(x)} = t.$$

Then

$$\lim_{x \to a+} \frac{f(x)}{g(x)} = t.$$

A similar conclusion holds for $\lim_{x \to b-} f(x)/g(x)$.
The rule also holds if a = −∞ or b = ∞; or if t = ∞ or t = −∞.


Note that we made no assumption about $\lim_{x \to a+} f(x)$. This is not a mistake.


Proof of the Proposition We shall only consider the case when t and a are finite
numbers. The other cases are left to the reader to complete.
Let ε > 0. Since $\lim_{x \to a+} f'(x)/g'(x) = t$, there exists δ₁ > 0, such that

$$\left| \frac{f'(z)}{g'(z)} - t \right| < \varepsilon$$

for all z that satisfy a < z < a + δ₁.
Let a < x < y < a + δ₁. It follows by Cauchy’s form of the mean value theorem,
that

$$\left| \frac{f(x) - f(y)}{g(x) - g(y)} - t \right| < \varepsilon,$$

which we rewrite in the form

$$t - \varepsilon < \frac{f(x) - f(y)}{g(x) - g(y)} < t + \varepsilon.$$

Now keep y fixed (if desired we could fix y as a + ½δ₁, but the thing we do require
is that a < x < y < a + δ₁). Since $\lim_{x \to a+} g(x) = \infty$ we find that g(x) > 0 and
g(x) − g(y) > 0 when x is sufficiently close to a, for example for a < x < a + δ₂
(where we may take a + δ₂ ≤ y), and then we have, for a < x < y < a + δ₁ and
a < x < a + δ₂, that

$$(g(x) - g(y))(t - \varepsilon) < f(x) - f(y) < (g(x) - g(y))(t + \varepsilon).$$

Dividing by g(x) (which is positive) gives

$$\left( 1 - \frac{g(y)}{g(x)} \right)(t - \varepsilon) < \frac{f(x) - f(y)}{g(x)} < \left( 1 - \frac{g(y)}{g(x)} \right)(t + \varepsilon)$$

and therefore

$$\frac{f(y)}{g(x)} + \left( 1 - \frac{g(y)}{g(x)} \right)(t - \varepsilon) < \frac{f(x)}{g(x)} < \frac{f(y)}{g(x)} + \left( 1 - \frac{g(y)}{g(x)} \right)(t + \varepsilon).$$

As x → a+ the left-hand member of the inequalities tends to t − ε and the right-
hand member to t + ε (recall that we keep y constant). Hence there exists δ₃ > 0,
such that the left-hand member is above t − 2ε and the right-hand member below
t + 2ε for all x that satisfy a < x < a + δ₃. Let δ = min(δ₂, δ₃). If a < x < a + δ
we find

$$t - 2\varepsilon < \frac{f(x)}{g(x)} < t + 2\varepsilon.$$

This says that $\lim_{x \to a+} f(x)/g(x) = t$ and concludes the proof of the first claim.
The proofs of the remaining claims are left to the reader.


The proof raises some interesting speculation about the meaning of “for each ε
there exists δ”. It is too simple to say that δ is supposed to be a function of ε. In the
above proof we first chose δ₁, in a non-explicit fashion, from the set of all possible
numbers that would work for the limit $\lim_{x \to a+} f'(x)/g'(x)$. Then y was chosen
rather arbitrarily. A workable value was defined in an aside but that was not really
necessary. Finally we found a δ that worked.

Where desired and possible we can try to define the quantities we use by functions
(such as using the max function). But at times we have to say, as in effect we did
at the beginning of the above proof, “here is a set (of usable δ’s), known to be
non-empty; let us choose one”. Sometimes it is simply not very helpful to try to see δ as
an explicitly computable function of ε.
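The ∞/∞ version of the rule (Proposition 5.16) can also be illustrated numerically. The example is ours and assumes the standard derivative of the logarithm: take f(x) = log x and g(x) = 1/x on ]0, 1[. Then g(x) → ∞ as x → 0+, g’(x) = −1/x² is never 0, and f’(x)/g’(x) = −x → 0, so the rule gives f(x)/g(x) = x log x → 0. Note that f itself tends to −∞; as remarked after the proposition, no assumption about f is needed.

```python
import math


def f(x):
    return math.log(x)


def g(x):
    return 1.0 / x


def df(x):
    return 1.0 / x


def dg(x):
    return -1.0 / x ** 2


xs = [10.0 ** (-k) for k in range(1, 9)]
quotients = [f(x) / g(x) for x in xs]
derivative_quotients = [df(x) / dg(x) for x in xs]

for x, q, dq in zip(xs, quotients, derivative_quotients):
    print(f"x = {x:.0e}:  f/g = {q:.3e},  f'/g' = {dq:.3e}")
```

Both columns tend to 0, the quotient f/g noticeably more slowly than f’/g’.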


5.8.7 Exercises 


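1. Calculate the following limits:


(a) $\lim_{x \to 1} \dfrac{x^4 - x^3 - x + 1}{x^4 - 3x^3 + 2x^2 + x - 1}$

(b) $\lim_{x \to 0} \dfrac{\sqrt{1 + x} - 1}{\sqrt[3]{1 + x} - 1}$

(c) $\lim_{x \to \infty} \left( \sqrt{x^2 + x} - x \right)$.

2. Calculate the following limits. Use your school knowledge of the circular
functions sin x and cos x and their derivatives (or refer to Sect. 5.1, Exercise 2).


(a) $\lim_{x \to 0} \left( \dfrac{1}{x} - \dfrac{1}{\sin x} \right)$

(b) $\lim_{x \to 0} \left( \dfrac{1}{x^2} - \dfrac{1}{x \sin x} \right)$

(c) $\lim_{x \to 0} \left( \dfrac{1}{6x} + \dfrac{1}{x^3} - \dfrac{1}{x^2 \sin x} \right)$.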

3. Exploiting only two properties of the exponential function:

$$\lim_{x \to \infty} e^x = \infty \quad \text{and} \quad \frac{d}{dx} e^x = e^x,$$

show that for any natural number n we have $\lim_{x \to \infty} e^x / x^n = \infty$.
Note. The conclusion demonstrates the proverbial growth of the exponential function in a
graphic way; it overpowers any polynomial.

4. Let f be twice differentiable in an interval A, let a be a point in A and
suppose that f’’(a) ≠ 0. Show that the tangent to the graph y = f(x) at the point
(a, f(a)) does not cross the graph at (a, f(a)). Show, in addition, that there exists
δ > 0, such that the tangent and the graph have no common point in the interval
]a − δ, a + δ[, except at x = a.

5. Suppose the function $f$ is differentiable in an interval $A$ and let $a \in A$. Let $\lambda$ 
and $\mu$ be distinct numbers. Show that 

$$f'(a) = \lim_{h \to 0} \frac{f(a + \lambda h) - f(a + \mu h)}{(\lambda - \mu) h}.$$

Note. A case important for numerical differentiation is $\lambda = 1$, $\mu = -1$. More general formulas 
are known, approximating the first, and higher, derivatives. See the next exercises. 


6. Suppose the function $f$ is differentiable in an interval $A$ and let $a \in A$. Show 
that 

$$f'(a) = \lim_{h \to 0} \frac{-f(a + 2h) + 8f(a + h) - 8f(a - h) + f(a - 2h)}{12h}.$$


7. Suppose the function $f$ is twice differentiable in an interval $A$ and let $a \in A$. 
Show that 

$$f''(a) = \lim_{h \to 0} \frac{f(a + h) - 2f(a) + f(a - h)}{h^2}.$$


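8. (a) Show that for all positive integers $n$ 

$$\sum_{k=0}^{n} (-1)^k \binom{n}{k} k^j = \begin{cases} 0, & j = 0, 1, \ldots, n-1 \\ (-1)^n n!, & j = n. \end{cases}$$

Hint. Expand $(1 - x)^n$ by the binomial rule. Repeatedly differentiate, but 
with a twist. 

(b) Suppose the function $f$ is $n$ times differentiable in an interval $A$ and let 
$a \in A$. Show that 

$$f^{(n)}(a) = \lim_{h \to 0} \frac{1}{h^n} \sum_{k=0}^{n} (-1)^{n-k} \binom{n}{k} f(a + kh).$$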


9. We can define the left derivative of $f$ at $c$, denoted by $D_l f(c)$, and the right 
derivative $D_r f(c)$, in the obvious way: 

$$D_l f(c) = \lim_{x \to c-} \frac{f(x) - f(c)}{x - c}, \qquad D_r f(c) = \lim_{x \to c+} \frac{f(x) - f(c)}{x - c},$$

when these limits exist. Now suppose that $f$ is differentiable in an interval 
$]c - a, c[$, continuous in $]c - a, c]$ and the limit $\lim_{x \to c-} f'(x)$ exists and is a 
finite number $A$. Show that $D_l f(c)$ exists and equals $A$. A similar result holds 
for the right derivative. 

Show that the result also holds if $A = \infty$ or $-\infty$, if we allow a derivative to be 
infinite (the definition should be obvious). 

10. The previous exercise has an interesting consequence. Suppose that $f$ is differentiable 
everywhere in an open interval $A$. Show that discontinuities of $f'$, if 
there are any, are never jump discontinuities. 

Note. $f'$ can be discontinuous. An example was exhibited in Sect. 5.5, Exercise 4. 


11. Show that if $f$ is differentiable everywhere in an open interval $A$, and if $f'$ is 
increasing, then $f'$ is continuous. 

12. Find an example of differentiable functions $f$ and $g$ with domain $\mathbb{R}$, such that 
$f(0) = g(0) = 0$, $\lim_{x \to 0} f(x)/g(x)$ exists, but $\lim_{x \to 0} f'(x)/g'(x)$ does not 
exist. 

13. Suppose that $f$ has $n$ derivatives at a point $c$ (meaning that $f$ has $n - 1$ derivatives 
in an interval $A$ containing $c$, and that $f^{(n-1)}$ is differentiable at $c$) and that 
$f(c) = 0$. One expects that the function $f(x)/(x - c)$, extended to be $f'(c)$ at 
$c$, should have (at least) $n - 1$ derivatives at $c$. This is quite tricky with such 
minimalistic premises. One can prove the following proposition, in which, for 
simplicity, we take $c = 0$: 


Suppose that $f$ has $n$ derivatives at 0. Suppose further that $f(0) = 0$. Let 
$g(x) = f(x)/x$ for $x \neq 0$, and $g(0) = f'(0)$. Then $g$ has $n - 1$ derivatives 
at 0 and $g^{(k)}(0) = f^{(k+1)}(0)/(k + 1)$ for $k = 0, 1, \ldots, n - 1$. 


You can deduce the proposition from the following two steps: 
(a) Show that 


g(x) = 


’ 


Cy he) 
x x 


and d 
Fy OD) = PT FO), 


for $x \neq 0$ and $k = 0, 1, \ldots, n - 1$. The condition $f(0) = 0$ is not needed for 
this step. 

(b) Show that 

$$\lim_{x \to 0} g^{(k)}(x) = \frac{1}{k+1} f^{(k+1)}(0)$$

for $k = 0, 1, \ldots, n - 1$. 

Note. If f has continuous derivatives up to order n a much simpler proof can be given 
by the fundamental theorem of calculus, a key result of integration theory. This is only a 
slight strengthening of the premises. See Sect. 12.2, Exercise 4. 
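
The limit formulas of Exercises 5 to 7 are the basis of numerical differentiation. The following Python sketch is an illustration only and proves nothing; the function names `two_point`, `five_point` and `second_difference` are our own choices, and the test function $e^x$ is picked simply because its derivatives are known. It evaluates the three quotients for a small but non-zero $h$.

```python
import math


def two_point(f, a, h, lam=1.0, mu=-1.0):
    """Quotient of Exercise 5: (f(a + lam*h) - f(a + mu*h)) / ((lam - mu)*h)."""
    return (f(a + lam * h) - f(a + mu * h)) / ((lam - mu) * h)


def five_point(f, a, h):
    """Quotient of Exercise 6, another approximation to f'(a)."""
    return (-f(a + 2 * h) + 8 * f(a + h) - 8 * f(a - h) + f(a - 2 * h)) / (12 * h)


def second_difference(f, a, h):
    """Quotient of Exercise 7, an approximation to f''(a)."""
    return (f(a + h) - 2 * f(a) + f(a - h)) / (h * h)


if __name__ == "__main__":
    a = 0.5
    exact = math.exp(a)  # every derivative of exp at a equals exp(a)
    for h in (1e-1, 1e-2, 1e-3):
        print(h,
              abs(two_point(math.exp, a, h) - exact),
              abs(five_point(math.exp, a, h) - exact),
              abs(second_difference(math.exp, a, h) - exact))
```

In floating-point arithmetic $h$ cannot be taken arbitrarily small, since rounding errors in the numerator are divided by $h$ or $h^2$; the limits in the exercises are statements about exact real numbers.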


5.9 (©) Multiplicity 


Let us assume that the function f has derivatives of all orders. Everything we are 
going to say can be formulated for functions with finitely many derivatives, but 
requires more circumlocution. Much of the material of this section will be developed 
through exercises. 


We say that a point $c$ is a root of $f(x) = 0$ with multiplicity $m$, if $f^{(k)}(c) = 0$ 
for $k = 0, \ldots, m - 1$ and $f^{(m)}(c) \neq 0$. As usual $f^{(0)}$ denotes $f$. We also speak of $c$ 
as an $m$-fold zero of $f$. A 1-fold zero is usually called a simple zero. A zero with 
multiplicity 2 or higher is called a multiple zero. When we talk of a zero of finite 
multiplicity we mean one with multiplicity m for some positive natural number m. 
When roots or zeros are counted according to multiplicity, it means that an m-fold 
zero is accorded the number m; it is viewed as m zeros for the purpose of counting 
them. 

In general a point c of a set B of real numbers (not necessarily here an interval) is 
called an isolated point of B if it is not a limit point of B. This is equivalent to saying 
that there exists $h > 0$, such that $c$ is the only point of $B$ in the interval $]c - h, c + h[$. 

We now state the following facts. For simplicity we suppose that f is defined in 
an open interval A. 


(a) The point $c$ is a zero with multiplicity $m$ if and only if we can write $f(x) = 
(x - c)^m g(x)$, where $g$ is a function on the same domain as $f$, $g$ has derivatives 
of all orders and $g(c) \neq 0$. 

(b) A zero of finite multiplicity is an isolated point of the set of zeros (or, in short, 
an isolated zero). 

(c) The function f changes sign at a zero of multiplicity m if and only if m is odd. 


5.9.1 Exercises 


1. Prove claims (a), (b) and (c). 

Hint. Section 5.8, Exercise 13 could be useful. 

2. Suppose that all zeros of $f$ have finite multiplicity. Let $a$ and $b$ be points of $A$, 
such that $a < b$ and neither point is a zero. Show that $f$ has at most finitely many 
zeros in $]a, b[$. 

Hint. One can use the Bolzano—Weierstrass theorem. 

3. In the previous exercise, if $f(a)$ and $f(b)$ have the same sign, show that the 
number of zeros in $]a, b[$, counted by multiplicity, is even. If $f(a)$ and $f(b)$ have 
opposite signs, show that the number of zeros in $]a, b[$, counted by multiplicity, 
is odd. 
Note. This extension to the intermediate value theorem, valid for functions with enough deriva- 
tives and for which all zeros are known to have finite multiplicity, is very useful. It applies to 
all polynomials, for example. More generally it applies to so-called analytic functions, studied 
in complex analysis. 

4. Let f be a polynomial with odd degree. Show that the number of real zeros of f, 
counted by multiplicity, is odd. 

5. Suppose that $f$ has $m$ zeros and $g$ has $n$ zeros in $A$, counted by multiplicity. Show 
that $fg$ has $m + n$ zeros in $A$, counted by multiplicity. 

6. Suppose that $f$ has $m$ zeros in $A$ (an open interval, recall), counted by multiplicity. 
Show that $f'$ has at least $m - 1$ zeros in $A$, again counted by multiplicity. 

7. Show that the polynomial 

$$P_n(x) = \frac{d^n}{dx^n} (x^2 - 1)^n$$

has $n$ distinct zeros in the interval $]-1, 1[$, all simple. 
8. Prove Descartes’ rule of signs. We consider a polynomial equation 

$$a_n x^n + a_{n-1} x^{n-1} + \cdots + a_0 = 0$$

with real coefficients. Without loss of generality we can assume that $a_n > 0$. We let 
$\sigma$ denote the number of sign changes in the sequence $(a_n, a_{n-1}, \ldots, a_0)$ (omitting 
any that are 0 for the purpose), and let $r_+$ be the number of strictly positive roots, 
counted by multiplicity. The rule of signs says: $r_+ \le \sigma$ and $\sigma - r_+$ is even. 

A proof might use the following steps: 

(a) Show that the difference $\sigma - r_+$ is even. 

(b) Complete the proof by induction on the degree of $f$. The case of degree 
1 is easy. Assuming that the rule of signs holds for polynomials of degree 
less than or equal to $n - 1$, we let $r'_+$ be the number of positive roots of the 
derivative $f'$, and let $\sigma'$ be the number of sign changes in the coefficients of 
$f'$. By the induction hypothesis, $r'_+ \le \sigma'$. Deduce that $r_+ \le \sigma + 1$ and use 
(a) to get $r_+ \le \sigma$. 
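
Counting sign changes is easily mechanised. The following minimal Python sketch (the names `sign_changes`, `reflect` and `descartes_candidates` are ours, and coefficients are assumed to be listed from the highest degree down) returns the values that the rule of signs allows for the number of positive roots. Applying it to the coefficients of $f(-x)$ gives the corresponding statement for negative roots.

```python
def sign_changes(coeffs):
    """Number of sign changes in a sequence, zeros being omitted."""
    signs = [c > 0 for c in coeffs if c != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)


def reflect(coeffs):
    """Coefficients of f(-x), given those of f(x), highest degree first."""
    n = len(coeffs) - 1
    return [c if (n - i) % 2 == 0 else -c for i, c in enumerate(coeffs)]


def descartes_candidates(coeffs):
    """Values allowed by the rule of signs: sigma, sigma - 2, ... down to 0 or 1."""
    return list(range(sign_changes(coeffs), -1, -2))


if __name__ == "__main__":
    f = [1, 0, -3, 2]  # x^3 - 3x + 2 = (x - 1)^2 (x + 2)
    print(descartes_candidates(f))           # candidates for positive roots
    print(descartes_candidates(reflect(f)))  # candidates for negative roots
```

For $x^3 - 3x + 2 = (x - 1)^2 (x + 2)$ the rule allows two or no positive roots, and exactly one negative root; the double root at 1 shows why roots must be counted by multiplicity.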


5.9.2. Sturm’s Theorem 


The intermediate value theorem can tell you that a continuous function has at least 
one root in a given interval. Consideration of multiplicity can tell you the parity of 
the total root-count. Sturm (1829) developed a method that can be used to compute 
the actual number of real roots of a polynomial equation in an interval. It exploits 
the Euclidean algorithm, which is used to find the highest common divisor of two 
polynomials of one variable. We shall summarise the Euclidean algorithm here; 
however, the reader unfamiliar with it should probably consult an algebra text. 

Let $f$ and $g$ be polynomials such that $\deg(g) \le \deg(f)$. We can divide $g$ into $f$, 
producing a quotient $q$ with degree $\deg(f) - \deg(g)$, and a remainder $r$ which, if not 
zero, has degree strictly less than $\deg(g)$. This entails that $f = gq + r$, and either $r$ 
is zero or it is the unique polynomial, with degree less than $\deg(g)$, for which this 
equation holds for some $q$. Putting it differently, $r$ is the unique polynomial, with 
degree less than $\deg(g)$, such that $g$ divides $f - r$. Let us denote the remainder, when 
$f$ is divided by $g$, by $\mathrm{rem}(f, g)$. 

The Euclidean algorithm computes the greatest common divisor of $f$ and $g$ 
in the following way. We set $r_0 = f$ and $r_1 = g$ and define recursively $r_{k+2} = 
\mathrm{rem}(r_k, r_{k+1})$, provided neither $r_k$ nor $r_{k+1}$ is zero. The degrees of the polynomials 
$r_k$ are strictly decreasing with increasing $k$ (except that $r_0$ and $r_1$ could have the 
same degree); so the process ends in a finite number of steps. This implies that there 
is a least integer $m$ such that $r_{m+1} = 0$. It is then not hard to see that $r_m$ is the highest 
common divisor of $f$ and $g$. 
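
The recursion just described can be sketched in a few lines of Python. This is an illustration under our own conventions, not part of the text: a polynomial is a list of coefficients with the highest degree first, exact rational arithmetic is used via `fractions.Fraction`, and the names `trim`, `poly_rem` and `poly_gcd` are hypothetical. The final normalisation to leading coefficient 1 is also our choice, since a highest common divisor is only determined up to a non-zero constant factor.

```python
from fractions import Fraction


def trim(p):
    """Drop leading zero coefficients; the zero polynomial becomes []."""
    i = 0
    while i < len(p) and p[i] == 0:
        i += 1
    return p[i:]


def poly_rem(f, g):
    """Remainder rem(f, g) when f is divided by g (highest degree first)."""
    f = trim([Fraction(c) for c in f])
    g = trim([Fraction(c) for c in g])
    if not g:
        raise ZeroDivisionError("division by the zero polynomial")
    while len(f) >= len(g):
        factor = f[0] / g[0]
        # Subtract factor * x^(deg f - deg g) * g; this cancels the leading term.
        for i in range(len(g)):
            f[i] -= factor * g[i]
        f = trim(f)
    return f


def poly_gcd(f, g):
    """Highest common divisor via r_{k+2} = rem(r_k, r_{k+1})."""
    r_prev = trim([Fraction(c) for c in f])
    r = trim([Fraction(c) for c in g])
    while r:
        r_prev, r = r, poly_rem(r_prev, r)
    if not r_prev:
        return []
    return [c / r_prev[0] for c in r_prev]  # normalise the leading coefficient


if __name__ == "__main__":
    f = [1, 0, -3, 2]   # x^3 - 3x + 2 = (x - 1)^2 (x + 2)
    df = [3, 0, -3]     # its derivative 3x^2 - 3
    print(poly_gcd(f, df))
```

In the demonstration the highest common divisor of $x^3 - 3x + 2$ and its derivative is $x - 1$, which reflects the double root at 1.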

Sturm’s theorem counts roots without multiplicity. It is best described by supposing 
that $f$ and $f'$ have highest common divisor 1. If this is not already the case we can 
divide $f$ by the highest common divisor of $f$ and $f'$. This does not change the roots 
of $f$ but converts all multiple roots into simple ones. 

Given that the highest common factor of $f$ and $f'$ is 1, we apply the Euclidean 
algorithm with a slight twist. We set $p_0 = f$, $p_1 = f'$ and then, recursively, $p_{k+2} = 
-\mathrm{rem}(p_k, p_{k+1})$. After a finite number of steps the sequence terminates, with $p_{m+1} = 
0$, say. As it is easy to see that $p_k$ is, up to sign, the same as $r_k$, and as the highest 
common factor of $f$ and $f'$ is 1, the polynomial $p_m$ must be a non-zero constant. 

Now for each $k$ we have 

$$p_{k-1} = p_k q_k - p_{k+1}$$

for a certain polynomial $q_k$. It follows that for each $j$, the polynomials $p_j$ and $p_{j+1}$ 
cannot have a common zero, for if they did, it would be a zero of every polynomial 
$p_k$, which is impossible since $p_m$ is a non-zero constant. Moreover, if for some $x$, 
and some $k$ in the range $0 < k < m$, we have $p_k(x) = 0$, then clearly $p_{k+1}(x)$ and 
$p_{k-1}(x)$ have opposite signs. These properties of the chain $p_0, p_1, \ldots, p_m$ are central 
to the proof of the following result. 


Proposition 5.17 (Sturm’s theorem) Suppose that $a < b$ and neither $a$ nor $b$ is 
a root of $f$ (where $f$ is assumed to have only simple roots). With $p_0, p_1, \ldots, p_m$ 
defined as above, for each $x$ we let $\sigma(x)$ be the number of sign changes in the 
sequence $p_0(x), p_1(x), \ldots, p_m(x)$ (ignoring zeros). Then the number of roots of $f$ in 
the interval $]a, b[$ equals $\sigma(a) - \sigma(b)$. 


Exercise Prove Sturm’s theorem. You can use the following steps: 


(a) Show that at a root of $f$ the number $\sigma(x)$ decreases by 1 (when the root is passed 
with increasing $x$). 

(b) Show that if $k \ge 1$, then passing a root of $p_k$, that is not also a root of $f$, does 
not change $\sigma(x)$. 

(c) Deduce Sturm’s theorem from (a) and (b). 


This result is so pretty that an illustrative example is called for. 


Example Count and roughly locate the roots of the polynomial $4x^4 - 16x^3 + 
11x^2 - 16x + 7$. 


A calculation (done by hand) revealed the following. 

$$p_0(x) = 4x^4 - 16x^3 + 11x^2 - 16x + 7$$

$$p_1(x) = 16x^3 - 48x^2 + 22x - 16$$

$$p_2(x) = \frac{13}{2}x^2 + \frac{13}{2}x - 3$$

$$p_3(x) = -\frac{1214}{13}x + \frac{592}{13}$$

$p_4(x)$ is a negative constant (indicating that all roots are simple). 


Tabulating the signs at $x = 0, 1, 2, 3, 4$ and $\pm\infty$ (that is, for $x$ sufficiently high 
or low to stabilise the signs) we find the following table of signs: 

        -∞    0    1    2    3    4   +∞ 
p0       +    +    -    -    -    +    + 
p1       -    -    -    -    +    +    + 
p2       +    -    +    +    +    +    + 
p3       +    +    -    -    -    -    - 
p4       -    -    -    -    -    -    - 


There is therefore exactly one root between 0 and 1, exactly one root between 3 
and 4, and no subsequent positive root. There is no negative root. It is fascinating 
to see the minuses drift upwards and vanish, like bubbles rising from a submerged 
wreck. 

By way of comparison, Descartes’ rule of signs (Exercise 8) indicates none, two 
or four positive roots, and no negative roots. The intermediate value theorem (with 
the supplement covered in Exercise 3) indicates (by the first row of the table) an 
odd number of roots between 0 and 1, and an odd number of roots between 3 and 
4, counted with multiplicity. This does not rule out three roots between 3 and 4, for 
example; nor does it rule out two roots between 4 and $+\infty$. 
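
The hand calculation can be checked mechanically. The following Python sketch builds the chain $p_0, p_1, \ldots, p_m$ by the rule $p_{k+2} = -\mathrm{rem}(p_k, p_{k+1})$ and evaluates $\sigma(a) - \sigma(b)$. It is a sketch under stated assumptions: polynomials are coefficient lists with the highest degree first, arithmetic is exact (`fractions.Fraction`), the input is assumed to have only simple roots, and neither endpoint may be a root. All function names are our own.

```python
from fractions import Fraction


def trim(p):
    """Drop leading zero coefficients; the zero polynomial becomes []."""
    i = 0
    while i < len(p) and p[i] == 0:
        i += 1
    return p[i:]


def poly_rem(f, g):
    """Remainder when f is divided by g (g must not be the zero polynomial)."""
    f = trim([Fraction(c) for c in f])
    g = trim([Fraction(c) for c in g])
    while len(f) >= len(g):
        factor = f[0] / g[0]
        for i in range(len(g)):
            f[i] -= factor * g[i]  # cancels the leading term of f
        f = trim(f)
    return f


def derivative(p):
    """Coefficients of p', for p of degree at least 1."""
    n = len(p) - 1
    return [c * (n - i) for i, c in enumerate(p[:-1])]


def evaluate(p, x):
    """Value p(x) by Horner's rule."""
    value = Fraction(0)
    for c in p:
        value = value * x + c
    return value


def sturm_chain(f):
    """The chain p_0 = f, p_1 = f', p_{k+2} = -rem(p_k, p_{k+1})."""
    chain = [trim([Fraction(c) for c in f])]
    chain.append(derivative(chain[0]))
    while True:
        r = poly_rem(chain[-2], chain[-1])
        if not r:
            return chain
        chain.append([-c for c in r])


def sign_changes(values):
    """Number of sign changes in a sequence, zeros being ignored."""
    signs = [v > 0 for v in values if v != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)


def count_roots(f, a, b):
    """sigma(a) - sigma(b): the number of roots of f in ]a, b[."""
    chain = sturm_chain(f)

    def sigma(x):
        return sign_changes([evaluate(p, Fraction(x)) for p in chain])

    return sigma(a) - sigma(b)


if __name__ == "__main__":
    f = [4, -16, 11, -16, 7]
    for a, b in [(-10, 0), (0, 1), (1, 3), (3, 4), (4, 100)]:
        print((a, b), count_roots(f, a, b))
```

The endpoints $-10$ and $100$ in the demonstration stand in for $\pm\infty$; they were chosen by inspection to lie beyond all roots of the chain, which is an assumption of this sketch and not something the code verifies.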


5.9.3 Exercises (cont’d) 


9. Find the number of real roots of the equation 
x — 20x +1=0 


along with their signs. 
Hint. Use Descartes’ rule of signs (Exercise 8) and the intermediate value theo- 
rem (Exercise 3). 

10. Find the number of real roots of the equation 
5x* - 10x? +2x+1=0 


in the given intervals: 


(a) $-\infty < x < -1$ 
(b) $-1 < x < 0$ 
(c) $0 < x < 1$ 
(d) $1 < x < \infty$. 


11. Find the number of positive roots and the number of negative roots of the equation 


$$x^4 + x^3 - 2x - 3 = 0.$$


5.9.4 Pointers to Further Study 


— Theory of equations 


5.10 Convex Functions 


We first give a geometric definition of convex function based on the graph of the 
function, viewed as a curve. The line segment joining two points on a curve is called 
a chord, this being the standard usage in the case of a circle. 


Definition Let $A$ be an interval. A function $f : A \to \mathbb{R}$ is said to be strictly convex, 
if, for each pair of points $a$ and $b$ in the interval $A$, with $a < b$, the graph of $f$ for 
$a < x < b$ lies strictly below the chord joining $(a, f(a))$ and $(b, f(b))$. 

Plain convexity is a slightly, but significantly, weaker notion. 


Definition Let $A$ be an interval. A function $f : A \to \mathbb{R}$ is said to be convex, if, for 
each pair of points $a$ and $b$ in the interval $A$, with $a < b$, the graph of $f$ between $a$ 
and $b$ does not go above the chord joining $(a, f(a))$ and $(b, f(b))$. 


Our focus is entirely on strict convexity.² At the level of single-variable calculus 
it is strict convexity that has all the interesting applications. In some calculus texts a 
strictly convex function is called concave-up, a term that describes it admirably. Its 
uses explored here (some of them in the exercises) include some interesting deductions 
about solutions of equations, minimisation problems, the Legendre transform, 
inflection points and (in the next section) a sharp form of Jensen’s inequality. Last 
but not least, an understanding about where a function is strictly convex and where 
strictly concave is a great aid to sketching its graph, still a useful mathematical skill. 

² This focus produces a tiresome need to repeat the words “strict” and “strictly”. An alternative 
would have been to use the term “convex” instead of “strictly convex” and in the few places where 
convexity of the not necessarily strict kind is mentioned, to use “weakly convex”. There is a precedent 
in some of the sources and it is consistent with the rule that the more useful version should have 
the simpler name. But it is not consistent with multivariate calculus where the greater usefulness of 
strict convexity compared to convexity is not so apparent. 

Now we translate strict convexity into algebra. One way to write the equation of 
the chord is to “proceed from the point $(a, f(a))$” thus 

$$y = \left(\frac{f(b) - f(a)}{b - a}\right)(x - a) + f(a).$$


Using this we can write the condition that $f$ is strictly convex as follows. For all $a$, 
$b$ and $x$ in $A$ such that $a < x < b$ we require 

$$f(x) < \left(\frac{f(b) - f(a)}{b - a}\right)(x - a) + f(a),$$

or equivalently 

$$\frac{f(x) - f(a)}{x - a} < \frac{f(b) - f(a)}{b - a}. \qquad (5.3)$$


This inequality asserts that the slope of the chord is an increasing function of its right 
endpoint (just think of b as variable). 

The inequality (5.3) is algebraically equivalent to each of two others; like it they 
each compare the slope of two chords. They are 

$$\frac{f(b) - f(a)}{b - a} < \frac{f(b) - f(x)}{b - x}, \qquad (5.4)$$

which asserts that the slope of the chord is an increasing function of its left endpoint, 
and 

$$\frac{f(x) - f(a)}{x - a} < \frac{f(b) - f(x)}{b - x}. \qquad (5.5)$$


It is a nice exercise for the reader to show that all three inequalities are algebraically 
equivalent. Any one of them implies the other two. Geometrically this is obvious, as 
the three quantities being compared are the slopes of three chords forming the sides 
of a triangle whose vertices are the points $(a, f(a))$, $(x, f(x))$ and $(b, f(b))$ on the 
curve $y = f(x)$. A picture makes this rather obvious. 

There is even a fourth version of the same inequality, also easy to obtain, that 
rather obviously expresses the claim that the graph is below the chord, namely 

$$f(x) < \left(\frac{b - x}{b - a}\right) f(a) + \left(\frac{x - a}{b - a}\right) f(b). \qquad (5.6)$$


Exercise Prove that the inequalities (5.3)—(5.6) are algebraically equivalent. 


Putting this together we can set out a rather wordy necessary and sufficient condition 
for strict convexity of the function $f$; that for every three points $a$, $x$ and $b$ in the 
interval of definition, such that $a < x < b$, at least one of the above four inequalities 
is verified (and if one is true then all are true). 

If, however, $f$ is differentiable there is a much simpler criterion. 


Proposition 5.18 A differentiable function f is strictly convex if and only if f’ is 
strictly increasing. 


Proof Suppose that $f$ is strictly convex and differentiable. Let $a < b$. It follows 
from the inequalities that the quotient $(f(x) - f(a))/(x - a)$ is a strictly increasing 
function of $x$ for $x > a$, and the quotient $(f(b) - f(x))/(b - x)$ is a strictly 
increasing function of $x$ for $x < b$. Hence 

$$f'(a) = \lim_{x \to a+} \frac{f(x) - f(a)}{x - a} < \frac{f(b) - f(a)}{b - a} < \lim_{x \to b-} \frac{f(b) - f(x)}{b - x} = f'(b),$$

giving $f'(a) < f'(b)$. 

Conversely suppose that $f'$ is strictly increasing. Let $a$, $x$, $b$ be in the interval of 
definition of $f$ and suppose that $a < x < b$. By the mean value theorem there are 
points $y$ between $a$ and $x$, and $z$ between $x$ and $b$, such that 

$$\frac{f(x) - f(a)}{x - a} = f'(y) \quad \text{and} \quad \frac{f(b) - f(x)}{b - x} = f'(z).$$

But $f'(y) < f'(z)$ so we find 

$$\frac{f(x) - f(a)}{x - a} < \frac{f(b) - f(x)}{b - x}.$$

This is inequality (5.5) and shows that $f$ is strictly convex. 


As an immediate consequence we have the most useful test for strict convexity; 
it is based on calculus rather than geometry, but requires second derivatives. 


Proposition 5.19 A sufficient condition for a twice differentiable function $f$ to be 
strictly convex is that $f''(x) > 0$ for all $x$ in the interval of definition. 
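
The chord-slope condition (5.5) lends itself to a quick numerical experiment. The following Python sketch (the function name `satisfies_5_5` is our own) tests the inequality on every triple $a < x < b$ drawn from a finite list of sample points. Such a test can refute strict convexity by exhibiting a failing triple, but it can never prove it, since only finitely many triples are examined; floating-point rounding is a further caveat.

```python
from itertools import combinations


def satisfies_5_5(f, points):
    """Check inequality (5.5) for all triples a < x < b of distinct sample points."""
    for a, x, b in combinations(sorted(set(points)), 3):
        left = (f(x) - f(a)) / (x - a)    # slope of the chord from a to x
        right = (f(b) - f(x)) / (b - x)   # slope of the chord from x to b
        if not left < right:
            return False
    return True


if __name__ == "__main__":
    samples = [-2, -1, 0, 1, 3]
    print(satisfies_5_5(lambda x: x * x, samples))   # second derivative 2 > 0
    print(satisfies_5_5(lambda x: x ** 3, samples))  # not convex for x < 0
    print(satisfies_5_5(abs, samples))               # convex but not strictly
```

The function $|x|$ fails the test because its graph contains straight line segments, so some pairs of chord slopes are equal; it is convex but not strictly convex.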


5.10.1 Tangent Lines and Convexity 


Another useful conclusion, and a fifth necessary and sufficient, purely geometric 
condition for strict convexity, but based on the assumption that the function is 
differentiable, is the following. 

Fig. 5.6 Five views of strict convexity 


Proposition 5.20 Let f be differentiable in the open interval A. A necessary and 
sufficient condition for f to be strictly convex is that for every c in A, the tangent 
line to the curve y = f (x) at the point (c, f (c)) lies wholly below the curve itself, 
except that they both contain the point (c, f (c)). 


Proof Suppose that $f$ is strictly convex. We know that $(f(c) - f(x))/(c - x)$ is 
a strictly increasing function of $x$ for $x < c$; and that $(f(x) - f(c))/(x - c)$ is a 
strictly increasing function of $x$ for $x > c$. Hence if $x < c$ we find 

$$\frac{f(c) - f(x)}{c - x} < \lim_{t \to c-} \frac{f(c) - f(t)}{c - t} = f'(c),$$

which implies 

$$f(x) > f(c) + f'(c)(x - c),$$

and if $c < x$ we find 

$$f'(c) = \lim_{t \to c+} \frac{f(t) - f(c)}{t - c} < \frac{f(x) - f(c)}{x - c},$$

which implies 

$$f(x) > f(c) + f'(c)(x - c).$$

This shows that the condition is necessary. 
The reader is invited to finish the proof by showing that the condition is sufficient 
for strict convexity given that $f$ is differentiable. 


The five geometrical conditions for strict convexity are illustrated in Fig. 5.6. 


5.10.2 Inflection Points 


A function $f$ such that $-f$ is strictly convex is called strictly concave (in some 
calculus texts it is called concave-down). Let $f$ be differentiable in the interval $A$. 
A point $(a, f(a))$ on the curve $y = f(x)$ is called an inflection point of the curve 
if there exists $h > 0$, such that $f$ is strictly convex [respectively, strictly concave] 
in the interval $]a - h, a[$, and strictly concave [respectively, strictly convex] in the 
interval $]a, a + h[$. 

In other words the function switches from strictly convex to strictly concave, or 
from strictly concave to strictly convex, at the point a. We say loosely that f has an 
inflection point at a. Inflection points are illustrated in Fig. 5.7. 

For some reason the notion of inflection point is only applied to differentiable 
functions; there has to be a tangent. Properly an inflection point is a property pos- 
sessed by a plane curve and not just a graph; it is a point where the curvature changes 
sign. The concept of curvature really belongs to the study of the differential geometry 
of plane curves. 

A necessary condition for an inflection point at $a$ is that $f'$ has either a local strict 
maximum or a local strict minimum at $a$. This is not sufficient. Again if $f$ is twice 
differentiable it is necessary that $f''(a) = 0$, but still not sufficient. We have to force 
$f''$ to change sign, to be strictly positive on one side of $a$ and strictly negative on the 
other. 

A problem left to the exercises is to find a sufficient condition that f has an 
inflection point at a that builds on higher derivatives of f at a alone. 

We often want to sketch the graph of a given function. Nowadays there are many 
good software packages that do this. A good sketch prepared without the help of a 
computer should show roughly where the function is strictly convex and where strictly 
concave. This means having some idea of where f” is positive, where negative and 
where the inflection points are that separate these regions. 


Fig. 5.7 Inflection points of $y = \sin x$. Intervals of convexity: $(2n - 1)\pi < x < 2n\pi$. 
Intervals of concavity: $2n\pi < x < (2n + 1)\pi$. Inflection points: $x = n\pi$, 
$n = 0, \pm 1, \pm 2, \pm 3, \ldots$ 


5.10.3 Exercises 


1. Let $f$ be a strictly convex function defined in an open interval $A$ and let $c$ be a 
point in $A$. Show that the limits 

$$\lim_{t \to c-} \frac{f(t) - f(c)}{t - c} \quad \text{and} \quad \lim_{t \to c+} \frac{f(t) - f(c)}{t - c}$$

both exist and that the first is less than or equal to the second. These limits are 
the left and right derivatives, $D_l f(c)$ and $D_r f(c)$. Give an example to show that 
they do not have to be equal. 

2. Show that a strictly convex function, defined in an interval A, is continuous if A 
is open, but that continuity may fail if A is not open. 

Hint. One way is to use the previous exercise. 

3. The function in Proposition 5.20 was assumed to be differentiable. Without 
assuming differentiability it is possible to say something similar, and obtain a 
sixth necessary and sufficient, purely geometric condition for strict convexity. 
Prove the following: 


A function f , defined in an open interval A, is strictly convex if and only if 
it satisfies the following condition: for every c in A there exists a straight 
line through the point (c, f (c)) that lies wholly below the graph of f, 
except that the line and graph both contain the point (c, f (c)). 


4. Let f be a convex function and suppose that there exist points a < x < b, such 
that the point (x, f(x)) lies on the chord joining (a, f(a)) and (b, f(b)). Show 
that the whole of the chord lies on the graph of f. So the graph of a non-strictly 
convex function differs from that of a strictly convex one by including some 
straight line segments. 

Hint. Consider how the inequalities (5.3)—(5.6) should be modified for a function 
that is convex but not necessarily strictly convex. 

5. Let $f$ be a strictly convex function on the interval $[0, \infty[$ and suppose that 
$f(0) = 0$. Show that $f$ satisfies 

$$f(a + b) > f(a) + f(b)$$

for all positive $a$ and $b$. 

6. Show that if $f$ is a strictly convex function and $a$ and $b$ are constants, then the 
function $f(x) + ax + b$ is also strictly convex. 

7. Suppose that a function $f$ satisfies 

$$f\left(\frac{a + b}{2}\right) < \frac{1}{2} f(a) + \frac{1}{2} f(b)$$

for all $a$ and $b$ in its interval of definition. 

(a) Show that 

$$f(ta + (1 - t)b) < t f(a) + (1 - t) f(b)$$

for all $a$ and $b$ and for all dyadic fractions $t$ in the interval $0 < t < 1$; that 
is, for all $t$ of the form $t = k/2^n$ where $n$ is a positive integer and $k$ is an 
integer in the range $1 \le k \le 2^n - 1$. 
(b) Show that if $f$ is continuous then $f$ is strictly convex. 

Hint. For item (a) use induction with respect to $n$. To get started figure 
out how to handle the case $t = \frac{1}{4}$. For item (b) use the fact that every real 
number in the interval $[0, 1]$ can be approximated arbitrarily closely by 
dyadic fractions, as is shown by the binary representation of real numbers, 
analogous to the decimal representation but using 2 as a base instead of 10. 
You will need to figure out why the inequality remains strict when $t$ is an 
arbitrary real number in the interval $0 < t < 1$. 


8. Show that a twice differentiable function is convex (of the not necessarily strict 
kind) if and only if its second derivative is non-negative. 

Note. This is a case where convexity is simpler than strict convexity. The counterpart of 
Proposition 5.19 is a necessary and sufficient condition for convexity, whereas Proposition 
5.19 is only a sufficient condition for strict convexity. 


9. Let $f$ be a strictly convex function. Show that a straight line intersects the graph 
of $f$ in at most two points. In other words, given constants $a$ and $b$, the equation 
$f(x) = ax + b$ has at most two roots. 

10. Let $f$ be a strictly convex function defined in an interval $A$. 

(a) Show that if $f$ attains a minimum it does so at a unique point. 

(b) Suppose that there exist distinct points $a$ and $b$ in $A$ such that $f(a) = f(b)$. 
Show that $f$ attains a minimum (which by (a) occurs at a unique point). 

(c) Let $c$ be an interior point of $A$ (that is, $c$ is not an endpoint). Show that 
$f$ attains a minimum at $c$ if and only if $D_l f(c) \le 0$ and $D_r f(c) \ge 0$ (see 
Exercise 1). 

(d) Suppose that $c$ is an interior point of $A$ and that $f$ is differentiable at $c$. Show 
that $f$ attains a minimum at $c$ if and only if $f'(c) = 0$. 


11. For the purposes of this exercise we shall call a line that cuts a curve $y = f(x)$ 
a secant line. A secant line meets the curve and crosses it; it contains points 
$(x_1, y_1)$ and $(x_2, y_2)$, such that $y_1 < f(x_1)$ and $y_2 > f(x_2)$. Note that this is 
slightly different from the common usage, which requires a secant to meet the 
curve in two points, an assumption not made here. 

Let $f$ be strictly convex in the whole real line. Show that a secant line that is 
parallel to some chord of the curve $y = f(x)$ cuts the curve in two points. 

12. Let $f$ be strictly convex and defined in the whole real line. Suppose that $f$ attains 
a minimum. Prove that $\lim_{x \to -\infty} f(x) = \lim_{x \to \infty} f(x) = \infty$. 

13. Let $f$ be defined in an open interval $A$ and let $c$ be a point in $A$. Show that 
the following is sufficient for $f$ to have an inflection point at $c$: the derivatives 
$f^{(j)}(c)$ exist up to $j = m$, $f^{(j)}(c) = 0$ for $j = 2, \ldots, m - 1$, $f^{(m)}(c) \neq 0$ and $m$ 
is odd. 


In the following series of exercises we study the Legendre transform. This is an 
important construction associated with convex functions that has many applications, 
both theoretical and practical. 

14. Let $f$ be strictly convex and differentiable in the open interval $A$. Let $c = \inf f'$ 
and $d = \sup f'$. For each $p$ in the interval $B := ]c, d[$ let $g_p$ be the function 
$g_p(x) = px - f(x)$. 

(a) Show that $g_p$ attains a maximum value at a unique point $x_p$ in $A$. 
(b) For each $p$ in $B$ we let 

$$f_*(p) = p x_p - f(x_p).$$

The function $f_*$ is called the Legendre transform of $f$. Now suppose that $f$ 
is twice differentiable and that $f'' > 0$. Show that 

$$(f_*)' = (f')^{-1}$$

and deduce that $f_*$ is strictly convex. 
(c) What if the second derivative does not exist? Can you prove the formula in 
item (b) from first principles, that is, by arguing from the difference quotient? 
(d) Show that $f_{**} = f$. Algebraically, the operation of passing from $f$ to $f_*$ is 
an involution. The same operation applied to $f_*$ brings one back to $f$. 


15. Prove Young’s inequality. Given that $f$ is strictly convex and differentiable, then 

$$px \le f_*(p) + f(x)$$

for all $x$ in $A$ and $p$ in $B$ (where $A$ and $B$ are the domains of $f$ and $f_*$, respectively). 

16. Show that the power function $x^a$, with $a > 1$, is strictly convex in its interval of 
definition $0 < x < \infty$. It therefore has a Legendre transform. Obtain nice results 
by computing the Legendre transform of $x^a/a$ and writing down the result of 
Young’s inequality. 

17. Try the previous exercise for the function $e^x$. You will need some knowledge of 
the exponential function and natural logarithm. 

18. Let $f$ be a strictly convex function defined in an open interval $A$. Let $B$ be the 
set of all real numbers $p$, such that the graph $y = f(x)$ has a chord with slope $p$. 

(a) Show that $B$ is an interval. 
(b) Show that for each $p$ in $B$ the function $px - f(x)$ attains a maximum at a 
unique point $x_p$ in $A$. 

This shows how the Legendre transform can be defined for strictly convex functions 
that are not everywhere differentiable. We set $f_*(p) = p x_p - f(x_p)$ for 
each $p$ in $B$. However, $f_*$, though convex, may fail to be strictly convex. 


(c) Let f be defined by 


_ f(@—1? ifx <0 
(OS 


Show that $f$ is strictly convex and compute $f_*$. Show that the latter is convex 
but not strictly convex. 
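
The definition of the Legendre transform as the maximum of $px - f(x)$ can be explored numerically. The following Python sketch is ours and rests on stated assumptions: $f$ is strictly convex, so that $px - f(x)$ is strictly concave in $x$, and the maximising point $x_p$ lies in the search interval `[lo, hi]` supplied by the caller. It locates $x_p$ by ternary search; the function name `legendre_transform` and its parameters are hypothetical.

```python
def legendre_transform(f, p, lo, hi, iterations=200):
    """Approximate (f_*(p), x_p), where f_*(p) = max over [lo, hi] of p*x - f(x).

    Assumes p*x - f(x) is strictly concave on [lo, hi] and that its
    maximum over the domain of f is attained inside [lo, hi].
    """
    def g(x):
        return p * x - f(x)

    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if g(m1) < g(m2):
            lo = m1   # the maximum cannot lie to the left of m1
        else:
            hi = m2   # the maximum cannot lie to the right of m2
    x_p = (lo + hi) / 2
    return g(x_p), x_p


if __name__ == "__main__":
    f = lambda x: x * x / 2
    value, x_p = legendre_transform(f, 3.0, -10.0, 10.0)
    print(value, x_p)
    # Young's inequality p*x <= f_*(p) + f(x), checked at one point:
    print(3.0 * 1.0 <= value + f(1.0))
```

For $f(x) = x^2/2$ the maximum of $px - x^2/2$ occurs at $x_p = p$ and equals $p^2/2$, so this function is its own Legendre transform; the sketch reproduces the value $4.5$ at $p = 3$ to within rounding error.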


5.11 (©) Jensen’s Inequality 


Jensen claimed that his inequality implied almost all known inequalities as special 
cases.³ If this was only partially true it would make it a remarkable object of study. 
Actually Jensen’s inequality is a natural enough extension of the fourth inequality 
characterising strictly convex functions, inequality (5.6). 


Proposition 5.21 Let $f$ be a strictly convex function with domain $A$, let $x_j$, $(j = 
1, 2, \ldots, n)$, be points in $A$, and let $t_j$, $(j = 1, 2, \ldots, n)$, be positive numbers such that 
$\sum_{j=1}^n t_j = 1$. Then 

$$f\Bigl(\sum_{j=1}^n t_j x_j\Bigr) \le \sum_{j=1}^n t_j f(x_j).$$

Equality holds if and only if the numbers $x_j$ are all equal. 


Proof We set $c = \sum_{j=1}^n t_j x_j$. Because the numbers $t_j$ are positive and sum to 1, it 
follows that $c$ belongs to the interval $A$. By the result of Sect. 5.10, Exercise 3, there 
exists a line through $(c, f(c))$, that lies wholly below the graph $y = f(x)$, except 
that both the line and the graph contain the point $(c, f(c))$. Let the line have the 
equation $y = f(c) + m(x - c)$. Then for all $x \neq c$ we have 

$$f(c) + m(x - c) < f(x)$$

whilst for $x = c$ we have equality. We now find 

$$\sum_{j=1}^n t_j \bigl(f(c) + m(x_j - c)\bigr) \le \sum_{j=1}^n t_j f(x_j)$$

with equality if and only if all the numbers $x_j$ are equal to $c$. Since $c = \sum_{j=1}^n t_j x_j$ 
and $\sum_{j=1}^n t_j = 1$ we obtain 

$$f(c) \le \sum_{j=1}^n t_j f(x_j),$$

which is Jensen’s inequality. 

³ This is stated in the book “A Course of Analysis” by E. G. Phillips, originally published in 1930. 
I don’t know what the author’s source was; maybe he knew Jensen. 


Jensen’s inequality is also valid if $f$ is merely convex. An analogue of Sect. 5.10, 
Exercise 3 holds for (not necessarily strictly) convex functions. In this case a line 
$y = m(x - c) + f(c)$ can be found that nowhere goes above the graph $y = f(x)$, so 
that Jensen’s inequality results just the same. However, we no longer get the striking 
conclusion that equality holds if and only if all the points $x_j$ are equal. 

One of the spectacular applications of Jensen’s inequality is to proving a general 
form of the inequality of arithmetic and geometric means. This is achieved by 
applying it to the exponential function $a^x$. We have not, so far, rigorously defined 
this function, but we can summarise what we need as follows: 


(a) For a given positive base a, the function a* extends to real x the power function 
a* with rational x as already defined. 
(b) We have the laws of exponents: 


a*'=a‘a’ and (a’)' =a". 


(c) The function a* has an inverse function (a being here a fixed base), the logarithm 
with base a denoted by log_; that is, the equation y = a* inverts to x = log, y. 
(d) The function a* is convex. 


Let $x_j$ $(j = 1, 2, \dots, n)$ be real numbers and let $t_j$ $(j = 1, 2, \dots, n)$ be positive numbers such that $\sum_{j=1}^{n} t_j = 1$. Applying Jensen’s inequality to the exponential function $2^x$ we obtain

$$2^{\sum_{j=1}^{n} t_j x_j} \le \sum_{j=1}^{n} t_j 2^{x_j}$$

with equality only if all the numbers $x_j$ are equal. By the laws of exponents this gives

$$\prod_{j=1}^{n} \bigl(2^{x_j}\bigr)^{t_j} \le \sum_{j=1}^{n} t_j 2^{x_j}.$$

Now let $a_j$ $(j = 1, 2, \dots, n)$ be positive real numbers and let $x_j = \log_2 a_j$. We obtain the generalised inequality of arithmetic and geometric means

$$\prod_{j=1}^{n} a_j^{t_j} \le \sum_{j=1}^{n} t_j a_j,$$

as well as the additional fact that equality only holds when the numbers $a_j$ are all equal.
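
As an illustration, here is a small sketch that computes both sides of the generalised inequality for given numbers $a_j$ and weights $t_j$. The function name `weighted_means` and the sample data are our own choices and are not taken from the text.

```python
import math

def weighted_means(a, t):
    """Return (geometric, arithmetic) weighted means of positive numbers a with weights t."""
    if abs(sum(t) - 1.0) > 1e-12 or min(t) <= 0 or min(a) <= 0:
        raise ValueError("need positive a_j and positive weights summing to 1")
    geometric = math.prod(x ** w for x, w in zip(a, t))   # product of a_j ** t_j
    arithmetic = sum(w * x for x, w in zip(a, t))         # sum of t_j * a_j
    return geometric, arithmetic

g, m = weighted_means([1.0, 4.0, 16.0], [0.5, 0.25, 0.25])
print(g, m)   # geometric mean 2*sqrt(2), arithmetic mean 5.5

g_eq, m_eq = weighted_means([3.0, 3.0, 3.0], [0.2, 0.3, 0.5])
print(g_eq, m_eq)   # the two means agree (up to rounding) when all a_j are equal
```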


5.11.1 Exercises 


1. Give a proof of Jensen’s inequality by induction on the number n. Note that the 
case n = 2 is inequality (5.6). Some care is required to include the conclusion 
that asserts when equality occurs. 

2. The case $n = 2$ of the generalised inequality of the arithmetic and geometric means,

$$a^s b^t \le sa + tb,$$

is sometimes called Young’s inequality. The assumptions are that $a$, $b$, $s$ and $t$ are all positive, and that $s + t = 1$. Equality holds if and only if $a = b$.


(a) Use Young’s inequality to prove Hölder’s inequality:

$$\sum_{k=1}^{n} a_k b_k \le \Bigl(\sum_{k=1}^{n} a_k^p\Bigr)^{1/p} \Bigl(\sum_{k=1}^{n} b_k^q\Bigr)^{1/q}$$

where $p > 1$, $q > 1$, $(1/p) + (1/q) = 1$, and for each $k$ we have $a_k \ge 0$, $b_k \ge 0$.

Hint. Do it first with the assumption $\sum_{k=1}^{n} a_k^p = \sum_{k=1}^{n} b_k^q = 1$. Apply Young’s inequality with $a = a_k^p$, $b = b_k^q$, $s = 1/p$, $t = 1/q$ and sum over $k$. Then remove the assumption.

(b) Show that equality holds in Hölder’s inequality if and only if the following is satisfied: either $b_k = 0$ for all $k$, or else there exists $t$ such that $a_k^p = t\,b_k^q$ for all $k$.


5.11.2 Pointers to Further Study 


— Convexity theory 
— Inequalities 


5.12 (©) How Fast Do Iterations Converge? 


Given the iteration $a_n = f(a_{n-1})$, and knowing that $\lim_{n\to\infty} a_n = t$, where $t$ is a root of $f(x) = x$, we wish to study the speed of convergence. Can we usefully specify how fast $a_n$ tends to $t$? This could be of great importance in deciding whether a method of approximating the solution to an equation is practical. For example, can we say approximately how many additional correct decimal digits are obtained in passing from $a_n$ to $a_{n+1}$?

The answer depends on studying the derivatives of $f$. The derivative $f'(t)$ at the root being approximated plays a key role. We will assume some knowledge of the common logarithm, and once we use Taylor’s theorem. More precisely we use the fact that for a twice-differentiable function $f$ the error $f(x) - f(a) - f'(a)(x - a)$, in using the first-degree Taylor polynomial as an approximation to $f(x)$, is $\tfrac{1}{2} f''(\xi)(x - a)^2$, for some $\xi$ between $a$ and $x$ (this special case was the content of Sect. 5.7, Exercise 2).

In the following we do not assume in advance that the iteration can be continued 
indefinitely. We assume throughout that f is defined in an open interval A. 


Proposition 5.22 Suppose that $f$ is differentiable in $A$ and that $f'$ is continuous. Let $t$ be a point in $A$, such that $f(t) = t$, and assume that $|f'(t)| < 1$. Then there exists $\delta > 0$, such that if $|a_0 - t| < \delta$ the iteration can be continued indefinitely and $a_n$ converges to $t$.


Proof Since $|f'(t)| < 1$ we can choose $k$ so that $|f'(t)| < k < 1$. Since $f'$ is continuous, there exists $\delta > 0$, such that the interval $]t - \delta, t + \delta[$ is included in the domain of definition of $f$, and such that $|f'(x)| < k$ for all $x$ that satisfy $|x - t| < \delta$.

Suppose that $|a_0 - t| < \delta$. Then the iteration cannot quit the interval $]t - \delta, t + \delta[$, and so continues indefinitely. For suppose that we have reached $a_n$ without quitting the interval. By the mean value theorem we have

$$a_{n+1} - t = f(a_n) - t = f(a_n) - f(t) = f'(\xi_n)(a_n - t) \tag{5.7}$$

for some number $\xi_n$ between $a_n$ and $t$. Hence

$$|a_{n+1} - t| \le k\,|a_n - t|$$

and since $k < 1$, the number $a_{n+1}$ is in the interval $]t - \delta, t + \delta[$. The iteration continues indefinitely and satisfies $|a_n - t| \le k^n |a_0 - t|$, and so $a_n$ converges to $t$.


The estimate $|a_n - t| \le k^n |a_0 - t|$ tells us something more that has practical importance but is not very precise. Each iteration step contributes at worst roughly the same number of additional correct decimal digits, the number obtained in each step being approximately $-\log_{10} k$.
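
The heuristic can be tried out on a concrete iteration. The sketch below is only an illustration under our own assumptions: we take $f(x) = \cos x$ (an example not discussed in the text), whose fixed point $t$ satisfies $|f'(t)| = \sin t < 1$, and we compare the predicted gain of $-\log_{10}|f'(t)|$ digits per step with the gain actually observed. The helper name `iterate` is hypothetical.

```python
import math

def iterate(f, a0, steps):
    """Return [a_0, a_1, ..., a_steps] for the iteration a_{n+1} = f(a_n)."""
    seq = [a0]
    for _ in range(steps):
        seq.append(f(seq[-1]))
    return seq

# Assumed example: f(x) = cos x. Run long enough to reach machine precision.
t = iterate(math.cos, 1.0, 500)[-1]
rate = abs(math.sin(t))            # |f'(t)|, about 0.67
predicted = -math.log10(rate)      # predicted digits gained per step, about 0.17

seq = iterate(math.cos, 1.0, 30)
errors = [abs(a - t) for a in seq]
# Average number of digits gained per step between a_10 and a_30.
observed = (math.log10(errors[10]) - math.log10(errors[30])) / 20

print(predicted, observed)
```

With a gain of about 0.17 digits per step, roughly six steps are needed for each further correct decimal digit, which shows how slow this kind of convergence can be.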


Proposition 5.23 Suppose that $f$ is differentiable in $A$ and that $f'$ is continuous. Let $t$ be a point in $A$, such that $f(t) = t$, and assume that $|f'(t)| > 1$. Then, however we choose $a_0$, provided $a_0 \ne t$ the iteration cannot converge to $t$.


Proof There exists $k > 1$ and $\varepsilon > 0$, such that $|f'(x)| > k$ for all $x$ that satisfy $t - \varepsilon < x < t + \varepsilon$. If $a_n \in\, ]t - \varepsilon, t + \varepsilon[$ then $|a_{n+1} - t| \ge k\,|a_n - t|$ according to (5.7). After a finite number of steps $a_j$ will quit the interval $]t - \varepsilon, t + \varepsilon[$ for some $j > n$. It is impossible that $N$ can exist, such that for all $n \ge N$ we have $|a_n - t| < \varepsilon$.

We can obtain much faster convergence under some conditions. Suppose that $f$ is twice differentiable in $A$ and that the second derivative $f''$ is bounded in absolute value by a constant $M$. Let $t$ be a point in $A$, such that $f(t) = t$, and assume that $f'(t) = 0$. We can find $\delta > 0$, such that $|f'(x)| < \tfrac{1}{2}$ for all $x$ that satisfy $t - \delta < x < t + \delta$. If $|a_0 - t| < \delta$ then (by the arguments above) $|a_n - t|$ is decreasing and $a_n$ converges to $t$.

The condition that $f'(t) = 0$ has a profound effect on the speed of convergence. Let us suppose that $a_n$ converges to $t$ (as it will do if the initial point is near enough to $t$ as we have just seen), but we do not assume that $|a_0 - t| < \delta$. Since $a_n$ tends to $t$ the inequality $|a_n - t| < \delta$ will certainly hold once $n$ is large enough. Furthermore, since $f'(t) = 0$, it follows by Taylor’s theorem (see Sect. 5.7, Exercise 2), that if $|a_n - t| < \delta$ then

$$a_{n+1} - t = f(a_n) - f(t) = \tfrac{1}{2} f''(\xi_n)(a_n - t)^2$$

where the number $\xi_n$ lies between $a_n$ and $t$. Then we find

$$|a_{n+1} - t| \le \frac{M}{2}\,|a_n - t|^2, \tag{5.8}$$

which we modify to read

$$\frac{M}{2}\,|a_{n+1} - t| \le \Bigl(\frac{M}{2}\,|a_n - t|\Bigr)^2.$$

Since $a_n$ tends to $t$, there exists $n_0$, such that $(M/2)\,|a_{n_0} - t| < 1$. Set

$$\rho = \frac{M}{2}\,|a_{n_0} - t|.$$

For $n \ge n_0$ we have

$$\frac{M}{2}\,|a_n - t| \le \Bigl(\frac{M}{2}\,|a_{n_0} - t|\Bigr)^{2^{n - n_0}} = \rho^{2^{n - n_0}}.$$

To see what this means we consider the common logarithm of the error, namely $\log_{10} |a_n - t|$. We have

$$-\log_{10} |a_n - t| \ge \log_{10}\Bigl(\frac{M}{2}\Bigr) + 2^{n - n_0}\,(-\log_{10} \rho).$$

As before we interpret the left-hand side as the approximate number of correct decimal digits. The number on the right-hand side approximately doubles at each iteration step (if $M$ is big we would have to add that $n$ must be sufficiently large). With some simplification we could say that the number of correct digits approximately doubles at each step, at the very least.

The type of convergence described here, where the error at step $n + 1$ is bounded by a constant times the square of the error at step $n$, as indicated in the inequality (5.8), is called quadratic convergence. It is clearly very desirable if an efficient computation method is sought. In the next sections we shall study some examples of this.


5.12.1 The Babylonian Method 


A striking example of quadratic convergence occurs with the Babylonian method for calculating the square root $\sqrt{c}$. This is the iteration

$$a_{n+1} = \frac{1}{2}\Bigl(a_n + \frac{c}{a_n}\Bigr).$$

Set

$$f(x) = \frac{1}{2}\Bigl(x + \frac{c}{x}\Bigr).$$

Then $f(\sqrt{c}) = \sqrt{c}$ and $f'(\sqrt{c}) = 0$. The conclusions of the last section tell us that $a_n \to \sqrt{c}$ if $a_0$ is sufficiently near to $\sqrt{c}$, and the number of correct digits approximately doubles with each step.

Let us try this on $\sqrt{3}$. Take $a_0 = 1$. The results are, up to $a_5$:


2, 1.75, 1.73214285714286, 1.73205081001473, 1.73205080756888


The number of correct digits for $a_2$, $a_3$, $a_4$ and $a_5$ (counting the leading digit) is successively

2, 4, 8, 15

roughly as predicted.
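
The calculation is easily repeated by machine. The following sketch assumes that the Babylonian step is $a_{n+1} = \tfrac{1}{2}(a_n + c/a_n)$; the function name `babylonian` is our own. For each iterate it prints $-\log_{10}$ of the error, which we interpret as the approximate number of correct digits after the decimal point.

```python
import math

def babylonian(c, a0, steps):
    """Return [a_1, ..., a_steps] for the iteration a_{n+1} = (a_n + c / a_n) / 2."""
    seq, a = [], a0
    for _ in range(steps):
        a = 0.5 * (a + c / a)
        seq.append(a)
    return seq

root = math.sqrt(3.0)
iterates = babylonian(3.0, 1.0, 5)
for n, a in enumerate(iterates, start=1):
    err = abs(a - root)
    # -log10(error); infinite if the iterate agrees with sqrt(3) to machine precision.
    digits = -math.log10(err) if err > 0 else float("inf")
    print(n, a, digits)
```

In double precision the doubling cannot continue beyond about 16 digits, so the last line of output reflects the limits of the arithmetic rather than those of the method.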

For which $a_0$ can we assert that $a_n$ tends to $\sqrt{c}$? This question is usually tricky. But in this case we can start at any positive $a_0$ whatever. The inequality of arithmetic and geometric means gives, for $x > 0$,

$$\frac{1}{2}\Bigl(x + \frac{c}{x}\Bigr) \ge \sqrt{c}$$

(equality only if $x = \sqrt{c}$). If $a_0 \ne \sqrt{c}$ then $a_n > \sqrt{c}$ for $n \ge 1$. And then $a_n$ is decreasing for $n \ge 1$ since

$$a_{n+1} - a_n = \frac{1}{2}\Bigl(\frac{c}{a_n} - a_n\Bigr) = \frac{c - a_n^2}{2a_n} < 0.$$

The limit $\lim_{n\to\infty} a_n$ therefore exists and equals the unique positive root of $f(x) = x$, which is $\sqrt{c}$.


5.12.2 Newton’s Method


The Babylonian method for calculating $\sqrt{c}$ is an instance of Newton’s method of approximation, more precisely it results from using Newton’s method to solve the equation $x^2 - c = 0$.

Let $g$ be twice differentiable. Newton’s method computes a solution of the equation $g(x) = 0$ by means of the iteration

$$a_{n+1} = a_n - \frac{g(a_n)}{g'(a_n)}.$$

Let

$$f(x) = x - \frac{g(x)}{g'(x)}.$$

If $t$ is a solution of $g(x) = 0$ and $g'(t) \ne 0$, then $f(t) = t$ and $f'(t) = 0$. We therefore have quadratic convergence to $t$ if $a_1$ is sufficiently near to $t$. How near it needs to be to ensure convergence is a sensitive and tricky question, and the reader should consult works on numerical analysis for further discussion.


Exercise Verify the claim that if $f(x) = x - (g(x)/g'(x))$ and $f(t) = t$ then $f'(t) = 0$.

Newton’s method is based on the plausible notion that if $f(a_1)$ is small whilst $f'(a_1)$ is big, so that the graph is steep at $(a_1, f(a_1))$ and close to the $x$-axis, then the graph must cross the $x$-axis at a point near to $a_1$. Moreover, a better approximation to the crossing point is found by following the tangent at $(a_1, f(a_1))$ until it crosses the $x$-axis. Just what “small” and “big” mean in this context has to be made precise. Sharp turning of the graph to defeat the crossing of the $x$-axis is prevented by having a bound on the second derivative. This is illustrated in Fig. 5.8. Thus intuitively, the success of Newton’s method beginning at a point $a_1$ depends on a delicate interplay between $f(a_1)$, $f'(a_1)$ and a local bound on $f''(x)$.

Nevertheless, in cases where f’ and f” do not change sign a relatively simple 
analysis is possible. This is presented in Exercise 4. 

As an example, the equation $x^3 - 2x + 2 = 0$ has exactly one real root and it lies between $-2$ and $-1$. The reader should check that Newton’s method gives rise to the iteration

$$a_{n+1} = \frac{2a_n^3 - 2}{3a_n^2 - 2}.$$

[Fig. 5.8 Newton’s method for the root $t$ of $f(x) = 0$. Labels in the figure: tangent steep; $f(a_1)$ small; why we must bound $f''$.]

If $a_1 = 0$ then $a_n$ jumps repeatedly back and forth between 0 and 1. If, on the other hand, $a_1 = -2$ then $a_n$ appears to converge fast, as $a_2 = -1.8$, $a_3 = -1.769948$, $a_4 = -1.769292$ and it looks as if we have already reached 3 correct decimal digits.
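
Both behaviours can be reproduced with a few lines of code. This is a minimal sketch, assuming the Newton step $a_{n+1} = a_n - g(a_n)/g'(a_n)$ with $g(x) = x^3 - 2x + 2$; the names `newton`, `g` and `dg` are our own.

```python
def newton(g, dg, a1, steps):
    """Return [a_1, ..., a_{steps+1}] for the iteration a_{n+1} = a_n - g(a_n) / g'(a_n)."""
    seq = [a1]
    for _ in range(steps):
        a = seq[-1]
        seq.append(a - g(a) / dg(a))
    return seq

def g(x):
    return x**3 - 2*x + 2

def dg(x):
    """Derivative of g."""
    return 3*x**2 - 2

cycle = newton(g, dg, 0.0, 4)
good = newton(g, dg, -2.0, 4)
print(cycle)   # alternates between 0 and 1
print(good)    # settles quickly near -1.7693
```

The first starting point shows that Newton’s method can fail to converge even for a cubic polynomial; the second shows how rapidly it succeeds from a well chosen start.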


5.12.3 Exercises 


1. Let $(a_n)_{n=1}^{\infty}$ be the sequence of Fibonacci numbers, 1, 1, 2, 3, 5, 8, 13, ... and so on. It is known that $a_{n+1}/a_n \to \phi$, where $\phi$ is the Golden Ratio. The convergence is quite slow. Apply the heuristic analysis that followed Proposition 5.22 to estimate roughly how many further correct decimal digits of $\phi$ are obtained with each increment of $n$. Check your estimate against reality by calculating some values of $a_{n+1}/a_n$.

Hint. Consult Sect. 3.3, Exercise 5.

2. Verify the claim that the Babylonian method results from applying Newton’s method to the problem $x^2 - c = 0$.

3. Apply Newton’s method to obtain an iteration scheme for the cube root, or, more generally, for the $r$th root of $c$.

4. There is a simple situation where we can always infer that Newton approximations converge to a solution. Suppose that $f$ is defined in an open interval $A$ and that $f'$ and $f''$ are both strictly positive in $A$. Suppose further that there is a root $t$ of $f(x) = 0$ in $A$.

(a) Show that $t$ is the only root of $f(x) = 0$ in $A$ and that $f$ changes sign at $t$.
(b) Let $a_1$ be a point in $A$ lying above $t$. Show that

$$t < a_1 - \frac{f(a_1)}{f'(a_1)} < a_1.$$

(c) Deduce that Newton approximations, beginning at $a_1$, form a decreasing sequence that converges to $t$.
(d) Suppose that $a_1 < t$. What can you say about Newton approximations beginning at $a_1$?

Note. Obviously there are variants of this conclusion in which $f'$ and $f''$ are both negative, or have opposite signs. What they have in common is that neither $f'$ nor $f''$ changes sign in $A$. It is left to the reader to explore them.

5.
Calculate all the real roots of the equation x° — 20x + 1 = 0 to three correct 
decimal places. 

6. The calculation of Gauss’s arithmetic-geometric mean (Sect. 3.4, Exercise 10) provides another nice example of the fast convergence found in Newton approximations. Let $a$ and $b$ be distinct positive numbers and define sequences by

$$a_0 = a, \quad b_0 = b, \qquad a_{n+1} = \tfrac{1}{2}(a_n + b_n), \quad b_{n+1} = \sqrt{a_n b_n}$$

for $n = 0, 1, 2, \dots$. The sequences satisfy $b_n < b_{n+1} < a_{n+1} < a_n$ for $n \ge 1$. The limits $\lim_{n\to\infty} a_n$ and $\lim_{n\to\infty} b_n$ are equal. Their common value is the arithmetic-geometric mean of $a$ and $b$, which we shall denote by $M(a, b)$.

(a) Let $c_n = a_n/b_n$. Show that

$$c_{n+1} = \frac{1}{2}\Bigl(\sqrt{c_n} + \frac{1}{\sqrt{c_n}}\Bigr).$$

(b) Deduce from item (a) that the convergence of $c_n$ to 1 is quadratic (that is, the error $|c_n - 1|$ satisfies an inequality like (5.8)). In fact, show that, given $\delta > 0$, the inequality

$$|c_{n+1} - 1| < \Bigl(\frac{1}{8} + \delta\Bigr)\,|c_n - 1|^2$$

holds for all sufficiently large $n$.
(c) Deduce that the convergence of $a_n - b_n$ to 0 is also quadratic.

Note. The fast convergence of $a_n$ and $b_n$ to $M(a, b)$ implied by the conclusion of item (c) has applications to the computation of so-called elliptic integrals. See the exercises in Sect. 11.2.


7. Let $A$ be a closed interval, which may be unbounded or even all of $\mathbb{R}$. Let the function $f : A \to A$ satisfy the following condition: there exists $K$, such that $0 < K < 1$ and $|f(s) - f(t)| \le K|s - t|$ for all $s$ and $t$ in $A$. Prove that there exists a unique $x$ in $A$, such that $f(x) = x$.

Hint. Let $a_0$ be any point whatsoever in $A$ and define the iteration $a_{n+1} = f(a_n)$, $n = 0, 1, 2, 3, \dots$. Show that for all natural numbers $n$ and $p$ we have

$$|a_{n+p} - a_n| \le |a_1 - a_0| \sum_{j=n}^{n+p-1} K^j$$

and apply Cauchy’s principle to show that $a_n$ converges. Argue that the limit $x$ is in $A$, that $f(x) = x$ and that this equation can have only one solution in $A$. Note how crucial it is that $f$ maps $A$ into itself and observe carefully why $A$ must be a closed interval.

8. Let $f : \mathbb{R} \to \mathbb{R}$. Suppose that $f$ is differentiable and that there exists $K > 0$, such that $K < 1$ and $|f'(t)| < K$ for all $t$. Show that there exists a unique $x$, such that $f(x) = x$.

9. Find an example of a function $f : \mathbb{R} \to \mathbb{R}$, such that $|f(a) - f(b)| < |a - b|$ for all distinct $a$ and $b$, but the equation $f(x) = x$ has no solution.


5.12.4 Pointers to Further Study 


— Dynamical systems 
— Numerical analysis 


Chapter 6
Integrals and Integration


Any segment of a section of a right angled cone [i.e. a parabola] 
is four-thirds of the triangle which has the same base and equal 
height 


Archimedes. The method of mechanical theorems 


6.1 Two Unlike Problems 


Problem A. To find an antiderivative for a given function. 
If $f'(x) = F(x)$, then we call the function $f$ an antiderivative for $F$. Now

$$\frac{d}{dx}\, x^n = n x^{n-1}$$

for each integer $n$. This tells us that $x^{n+1}/(n + 1)$ is an antiderivative for $x^n$ in the cases $n = 0, 1, \pm 2, \pm 3, \dots$. But what can be an antiderivative for $x^{-1}$? It makes no sense to put $n = -1$ in this formula. This question greatly exercised the mathematicians who invented calculus in the seventeenth century.

We can ask the more general question: which functions have an antiderivative? Our problem is to solve the simplest of all differential equations: given the function $F$ to find a function $y(x)$, such that

$$\frac{dy}{dx} = F(x).$$

Problem B. To calculate the area of a plane figure bounded by a curve. 


Historically this problem was called quadrature, as assigning an area to a plane figure meant that a square with the same area was determined. We will not give a general definition of area; that is the subject of measure theory. Nevertheless this will not stop us from discussing it, any more than it stopped the mathematicians of antiquity.

Archimedes calculated the area of a circle and the area of a parabolic segment (the figure bounded by a parabola and one of its chords). He gave the formula $\pi r^2$ for the circle and showed that $223/71 < \pi < 22/7$. His greatest achievement in the computation of area was the parabolic segment, stating that its area was 4/3 times the area of a certain inscribed triangle (with base the given chord and top vertex at the point on the parabola where the tangent was parallel to the chord). To reach this conclusion he had to invent a method, the method of exhaustion, that in its use of an infinite sequence of approximations from below resembles modern integration theories. He also had to compute the sum of the geometric series $\sum 1/4^n$ (Fig. 6.1).

[Fig. 6.1 Archimedes’ parabolic segment. Method of exhaustion. Segment = 4/3 × green triangle.]

Fast forward to the seventeenth century and we find Kepler considering the volume of a wine barrel. This is a solid of revolution and the calculation of its volume depends on calculating the area of a plane figure.

Only with the invention of calculus was a method proposed that could be used to 
calculate the areas of general plane figures, starting with the area under the graph 
of a function. In the first place we consider the area between the graph of a positive 
function f and the x-axis, cut off by two vertical lines x = a and x = b (Fig. 6.2). 
This leads to the definition of the Riemann integral or the Darboux integral; two 
different approaches that turn out to be equivalent. We shall call it the Riemann— 
Darboux integral, although in defining it we shall take Darboux’s approach. 

We therefore proceed to Problem B and only later show how it leads to a solution 
to Problem A. 


6.2 Defining the Riemann—Darboux Integral 


Let $f : [a, b] \to \mathbb{R}$ be a bounded function. Its domain is a bounded and closed interval. We do not assume that $f$ is continuous. This is an advantage because it is necessary for practical applications to be able to integrate some discontinuous functions. But it is essential for the following considerations to make sense that $f$ should be bounded. This, and the requirement that the domain is a bounded and closed interval, are defects of the Riemann integral that were successfully removed by the introduction of the Lebesgue integral in the early twentieth century.

[Fig. 6.2 Problem B. Calculate the area of a plane figure. Inset: “Quadrature noun. Calculation of the area bounded by or lying under a curve.” Oxford English Dictionary. The region lies between the lines $x = a$ and $x = b$.]

We do not assume that $f$ is positive. However, in the case that $f(x) \ge 0$ for all $x$, the integral, when successfully defined, will give an acceptable notion for the area bounded by the lines $x = a$, $y = 0$, $x = b$ and the graph $y = f(x)$.


Definition A partition $P$ of the interval $[a, b]$ is a finite sequence $(t_j)_{j=0}^{m}$ (not necessarily uniformly spaced), such that

$$a = t_0 < t_1 < t_2 < \cdots < t_m = b.$$

The intervals $[t_j, t_{j+1}]$ are called the subintervals of the partition.
For a given partition $(t_0, t_1, t_2, \dots, t_m)$ we set

$$m_j = \inf_{[t_j, t_{j+1}]} f, \qquad M_j = \sup_{[t_j, t_{j+1}]} f, \qquad j = 0, 1, \dots, m - 1$$

and define the lower sum $L(f, P)$ and the upper sum $U(f, P)$ by

$$L(f, P) = \sum_{j=0}^{m-1} m_j (t_{j+1} - t_j), \qquad U(f, P) = \sum_{j=0}^{m-1} M_j (t_{j+1} - t_j).$$

It is clear that $L(f, P) \le U(f, P)$, since $m_j \le M_j$ for each $j$.


Definition A partition P’ is said to be finer than the partition P if every point of P 
is also a point of P’. 


In the next three propositions we assume that $f$ is a bounded function on the interval $[a, b]$.

Proposition 6.1 Let $P$ and $P'$ be partitions of $[a, b]$. If $P'$ is finer than $P$ then

$$L(f, P) \le L(f, P') \le U(f, P') \le U(f, P).$$

Proof Consider how $L(f, P)$ changes if an additional point $r$ is included in the partition. Suppose that $t_j < r < t_{j+1}$. The only change in $L(f, P)$ that arises is due to the replacement of the term $m_j(t_{j+1} - t_j)$ by the sum of two terms

$$m'_j (r - t_j) + m''_j (t_{j+1} - r),$$

where $m'_j = \inf_{[t_j, r]} f$ and $m''_j = \inf_{[r, t_{j+1}]} f$. But $m'_j \ge m_j$ and $m''_j \ge m_j$ (since the new infima are taken over smaller sets), so that

$$m'_j (r - t_j) + m''_j (t_{j+1} - r) \ge m_j (t_{j+1} - t_j),$$

and therefore $L(f, P) \le L(f, P')$. The other inequality is proved by a similar argument.
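
The effect of refining a partition can be watched in a small computation. The sketch below is an illustration under a simplifying assumption of our own: the function is increasing, so that the infimum and supremum on each subinterval are attained at its left and right endpoints and can be computed exactly. The names `lower_upper_sums` and `square` are hypothetical, and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction

def lower_upper_sums(f, partition):
    """Lower and upper sums of an increasing f over a partition t_0 < ... < t_m.

    Assumes f is increasing, so that on each subinterval [t_j, t_{j+1}]
    the infimum is f(t_j) and the supremum is f(t_{j+1}).
    """
    pairs = list(zip(partition, partition[1:]))
    lower = sum(f(s) * (t - s) for s, t in pairs)
    upper = sum(f(t) * (t - s) for s, t in pairs)
    return lower, upper

def square(x):
    """f(x) = x^2, which is increasing on [0, 1]."""
    return x * x

P = [Fraction(k, 4) for k in range(5)]                         # 0, 1/4, 1/2, 3/4, 1
P_finer = sorted(set(P) | {Fraction(1, 3), Fraction(7, 8)})    # two extra points

L1, U1 = lower_upper_sums(square, P)
L2, U2 = lower_upper_sums(square, P_finer)
print(L1, U1)   # 7/32 15/32
print(L2, U2)   # the lower sum has not decreased, the upper sum has not increased
```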


Proposition 6.2 Let $P_1$ and $P_2$ be partitions of $[a, b]$. Then

$$L(f, P_1) \le U(f, P_2).$$

Proof Create a new partition $P_3$ by uniting the points in $P_1$ and $P_2$ into one sequence. Then $P_3$ is finer than $P_1$ and also finer than $P_2$. This implies that

$$L(f, P_1) \le L(f, P_3) \le U(f, P_3) \le U(f, P_2),$$

so that $L(f, P_1) \le U(f, P_2)$ as required.


Consider next all numbers $L(f, P)$, that is, all lower sums, as $P$ ranges over all possible partitions. These form a set (we could define it by specification for example). This set is moreover bounded above; for example, if we fix a partition $P_1$, then $L(f, P) \le U(f, P_1)$ for every partition $P$. Similarly the set of all upper sums $U(f, P)$ is bounded below. We therefore define the lower and upper integrals

$$\underline{\int} f := \sup_P L(f, P), \qquad \overline{\int} f := \inf_P U(f, P)$$


as the supremum of the lower sums and the infimum of the upper sums respectively, 
taken over all possible partitions. 

If f is a positive function and we wish to assign an area to the region between 
the graph y = f(x) and the x-axis, bounded by the lines x = a and x = b, then it 
seems clear that whatever this area might be, it should lie between the lower and 
upper integrals. 


Proposition 6.3

$$\underline{\int} f \le \overline{\int} f.$$

Proof Let $P_1$ and $P_2$ be partitions of $[a, b]$. Then $L(f, P_1) \le U(f, P_2)$. Taking the supremum over all partitions $P_1$, we obtain

$$\underline{\int} f \le U(f, P_2).$$

Taking next the infimum over all partitions $P_2$, we obtain $\underline{\int} f \le \overline{\int} f$ as required.


Now we can define the Darboux integral. It has to be said that the process leading to 
this definition is remarkably short. As with the treatment of some previous concepts, 
such as limit or derivative, the definition singles out a class of functions, here called 
integrable, and for each integrable function defines a number called its integral. 


Definition Let the function f be bounded on the interval [a, b]. If the upper and 
lower integrals of f are equal, we say that f is integrable (on the interval [a, b]). If f 
is integrable, the common value of its upper and lower integrals is called the integral 
of f (on the interval [a, b]). It is commonly denoted by one of the following: 


$$\int f, \qquad \int_a^b f, \qquad \int_a^b f(x)\,dx.$$


6.2.1 Thoughts on the Definition 


The concept of integral has a reputation for being hard to define. The definition we 
have just given for the Riemann—Darboux integral is actually quite short and some 
of its complexities may be concealed. 

First of all the role of the completeness axiom comes out clearly in the repeated 
use of supremum and infimum. The supremum of the set of lower sums (defining the 
lower integral) is analogous to the supremum of a function. It is not though a function 
that assigns a real number to each real number in its domain, for the domain here 
is not a set of real numbers, but the set of partitions. The notation L(f, P) reflects 
this and emphasises the dependence on P (whilst f remains fixed throughout the 
discussion). 

It appears that the integral is essentially a more complex concept than the derivative. Previously the only sets we encountered were sets of real numbers, mainly intervals, or sets of natural numbers, and one could quite happily define the derivative without using more complex sets. When it comes to the integral, we have to embrace the set of all partitions of an interval. A partition is a sequence of real numbers with certain constraints; so the set of all partitions is a set of sequences of real numbers. This is a higher level of complexity than a set of real numbers. It seems that every approach to the integral involves complexity at this level.

Another approach to the integral is possible in which we approximate f from both 
above and below by step functions. We will then encounter sets of step functions. 


Definition A function $g : [a, b] \to \mathbb{R}$ is called a step function if there exists a partition $(t_0, t_1, t_2, \dots, t_m)$ of $[a, b]$, and numbers $(c_0, c_1, c_2, \dots, c_{m-1})$, such that $g(x) = c_j$ for $t_j < x < t_{j+1}$, $j = 0, 1, 2, \dots, m - 1$. In other words $g$ is constant on each open interval $]t_j, t_{j+1}[$.


The area under the graph of a positive step function ought by rights to be $\sum_{j=0}^{m-1} c_j (t_{j+1} - t_j)$. This suggests that we first define the integral for the step function $g$, whether positive or not, as

$$S(g) = \sum_{j=0}^{m-1} c_j (t_{j+1} - t_j).$$

For a function $f$, supposed bounded on $[a, b]$, we can define the set of lower approximations as the set of all numbers $S(g)$ as $g$ ranges through step functions such that $g \le f$ (it is here that a set of step functions is needed). This set is not empty thanks to the boundedness of $f$. Similarly the set of upper approximations is the set of all numbers $S(g)$ as $g$ ranges through step functions such that $g \ge f$.

So far neither supremum nor infimum has been used. Next, we define the lower 
integral as the supremum of the set of all lower approximations and the upper integral 
as the infimum of the set of all upper approximations. Finally, the function is called 
integrable when the lower and upper integrals coincide. 
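
The quantity $S(g)$ is a finite sum and is trivial to compute. A minimal sketch follows, in which the function name `step_integral` and the representation of a step function by two lists (partition points and values on the open subintervals) are our own conventions. Note that the values of $g$ at the partition points themselves play no role.

```python
def step_integral(points, values):
    """S(g) = sum of c_j * (t_{j+1} - t_j) for a step function g.

    points: t_0 < t_1 < ... < t_m
    values: c_0, ..., c_{m-1}, where c_j is the value of g on ]t_j, t_{j+1}[
    """
    if len(values) != len(points) - 1:
        raise ValueError("need exactly one value per subinterval")
    if any(s >= t for s, t in zip(points, points[1:])):
        raise ValueError("partition points must be strictly increasing")
    return sum(c * (t - s) for c, s, t in zip(values, points, points[1:]))

# g = 2 on ]0,1[, g = -1 on ]1,3[, g = 5 on ]3,4[
print(step_integral([0, 1, 3, 4], [2, -1, 5]))   # 2*1 + (-1)*2 + 5*1 = 5
```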

The idea of approximating a function from above and below by simpler functions 
for which the integral has an obvious definition is common to many approaches to 
defining integrals. In particular it recurs in the definition of the Lebesgue integral, 
one of the greatest achievements of analysis in the twentieth century, to which the 
Riemann—Darboux integral is but a halfway house, and many of its faults are thereby 
alleviated. 


Exercise Prove that the integral defined using approximation by step functions is 
the same as the Riemann—Darboux integral. 


6.3 First Results on Integrability 


The definition of the Riemann—Darboux integral raises some questions: 


(a) What functions are integrable? More precisely, what conditions can we impose 
on f (in addition to its being bounded) that suffice for f to be integrable? 

(b) Continuous functions on the interval $[a, b]$ are necessarily bounded. Are they integrable?
(c) Are step functions integrable? If so, and if $f$ is a step function, is $\int f = S(f)$ (as defined in the last section)?

(d) If f is integrable can we find a practical way to calculate the integral? It is clearly 
impractical to compute the supremum over all lower sums. 


We shall devote a considerable effort and a large part of this text to answering these 
questions. 

One step is used repeatedly in the proofs and it is useful to set it out in advance. Let $\varepsilon > 0$. Since the lower integral is the supremum of the lower sums $L(f, P)$ over all partitions $P$, and the upper integral is the infimum of all upper sums $U(f, P)$ over all partitions $P$, there exists a partition $P_1$, such that

$$L(f, P_1) > \underline{\int} f - \varepsilon$$

and another partition $P_2$, such that

$$U(f, P_2) < \overline{\int} f + \varepsilon.$$

Now construct a partition $P$ by uniting the points of $P_1$ and $P_2$. Then $P$ is simultaneously finer than both $P_1$ and $P_2$. Hence in passing from $P_1$ and $P_2$ to $P$, the lower sum cannot decrease and the upper sum cannot increase. Therefore the above inequalities hold also for $P$ in place of $P_1$ and $P_2$.

The convenience is that both inequalities hold for the same partition. We can even do the same for a finite set of functions. For example, for two functions $f$ and $g$, and a given $\varepsilon$, we can find a single partition $P$, such that the inequalities hold for both $f$ and $g$.


6.3.1 Riemann’s Condition 


The condition introduced here is basic for proving that given functions are integrable. 


Proposition 6.4 The function $f$ is integrable if and only if the following condition (which we shall call Riemann’s condition¹) is satisfied: for each $\varepsilon > 0$ there exists a partition $P$, such that

$$U(f, P) - L(f, P) < \varepsilon.$$

Proof Assume that $f$ is integrable. Then $\underline{\int} f = \overline{\int} f = \int f$. Choose a partition $P$, such that

$$U(f, P) < \int f + \frac{\varepsilon}{2} \quad\text{and}\quad L(f, P) > \int f - \frac{\varepsilon}{2}.$$

It follows that

$$U(f, P) - L(f, P) < \int f + \frac{\varepsilon}{2} - \int f + \frac{\varepsilon}{2} = \varepsilon.$$

Conversely, assume that Riemann’s condition is satisfied. Let $\varepsilon > 0$. Choose a partition $P$, such that $U(f, P) - L(f, P) < \varepsilon$. Now we have

$$L(f, P) \le \underline{\int} f \le \overline{\int} f \le U(f, P)$$

so that $\overline{\int} f - \underline{\int} f < \varepsilon$. But this holds for all $\varepsilon > 0$. We conclude that $\underline{\int} f = \overline{\int} f$.

¹ The name “Riemann’s condition” appears in the book “Mathematical Analysis” by T. Apostol. I do not know of any other author who names it after Riemann. It is, however, convenient to have a name for it.


The great strength of Riemann’s condition is that we only have to find a single partition that satisfies $U(f, P) - L(f, P) < \varepsilon$. At this point it is useful to note that

$$U(f, P) - L(f, P) = \sum_{j=0}^{m-1} \Omega_j(f)\,(t_{j+1} - t_j)$$

where $\Omega_j(f)$ denotes the oscillation of $f$ on the interval $[t_j, t_{j+1}]$, that is, the difference between the supremum and the infimum (see Sect. 4.5). We recall (Sect. 4.5 Exercise 4) that the oscillation of $f$ on the interval $[c_1, c_2]$ is the same as the quantity

$$\sup_{c_1 \le x,\, y \le c_2} |f(x) - f(y)|.$$

The supremum here is taken over all pairs of points, $x$ and $y$, in the interval $[c_1, c_2]$. This formula is very useful for comparing the oscillation of two functions, especially when it is required to deduce the integrability of one of them from the known integrability of the other, as we shall see.


6.3.2 Integrability of Continuous Functions and Monotonic 
Functions 


We begin to answer the question as to which functions are integrable. We shall 
show that, loosely paraphrased, continuous functions and monotonic functions are 
integrable. 


Proposition 6.5 Let $f : [a, b] \to \mathbb{R}$ be continuous. Then $f$ is integrable.


Proof Let $\varepsilon > 0$. We use the small oscillation theorem (Proposition 4.14; now is the time to read it). There exists a partition $P$, such that $M_j - m_j < \varepsilon$ for each subinterval of the partition. But then

$$U(f, P) - L(f, P) = \sum_{j=0}^{m-1} (M_j - m_j)(t_{j+1} - t_j) < \varepsilon \sum_{j=0}^{m-1} (t_{j+1} - t_j) = \varepsilon (b - a)$$

and Riemann’s condition is satisfied.

[Fig. 6.3 Picture of the proof, adapted from Newton’s Principia. A decreasing function is integrable. The difference $U(f, P) - L(f, P)$ (pale blue) is equal in area to the hatched region and is made as small as we like by refining the partition.]


Newton’s pictorial proof of the integrability of monotonic functions is illustrated 
in Fig. 6.3. 


Proposition 6.6 Let $f : [a, b] \to \mathbb{R}$ be monotonic. Then $f$ is integrable.


Proof Assume for example that $f$ is increasing (though not necessarily strictly). If $f(a) = f(b)$ then $f$ is constant and obviously integrable; see the next section. So we may suppose that $f(a) < f(b)$.

Let $\varepsilon > 0$. Construct a partition $P = (t_0, t_1, \dots, t_m)$, such that

$$t_{j+1} - t_j < \frac{\varepsilon}{f(b) - f(a)}$$

for $j = 0, 1, 2, \dots, m - 1$. Since $f$ is increasing we have $m_j = f(t_j)$ and $M_j = f(t_{j+1})$, and we verify Riemann’s condition by the calculation

$$\begin{aligned}
U(f, P) - L(f, P) &= \sum_{j=0}^{m-1} (M_j - m_j)(t_{j+1} - t_j) \\
&= \sum_{j=0}^{m-1} \bigl(f(t_{j+1}) - f(t_j)\bigr)(t_{j+1} - t_j) \\
&< \frac{\varepsilon}{f(b) - f(a)} \sum_{j=0}^{m-1} \bigl(f(t_{j+1}) - f(t_j)\bigr) \\
&= \frac{\varepsilon}{f(b) - f(a)} \bigl(f(b) - f(a)\bigr) = \varepsilon.
\end{aligned}$$
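
The proof is constructive enough to be turned into a computation. The following sketch, with hypothetical function names of our own and the exponential function on $[0, 1]$ as an assumed example, chooses a uniform partition whose subintervals are shorter than $\varepsilon/(f(b) - f(a))$ and confirms that $U(f, P) - L(f, P) < \varepsilon$. For a uniform partition into $m$ parts the sum telescopes, and the difference equals $(f(b) - f(a))(b - a)/m$.

```python
import math

def riemann_gap_increasing(f, a, b, m):
    """U(f, P) - L(f, P) for an increasing f on the uniform partition of [a, b] into m parts."""
    pts = [a + (b - a) * j / m for j in range(m + 1)]
    # For increasing f: M_j - m_j = f(t_{j+1}) - f(t_j).
    return sum((f(t) - f(s)) * (t - s) for s, t in zip(pts, pts[1:]))

def subintervals_needed(f, a, b, eps):
    """Smallest m with (b - a) / m < eps / (f(b) - f(a)); assumes f(a) < f(b)."""
    return math.floor((b - a) * (f(b) - f(a)) / eps) + 1

eps = 1e-3
m = subintervals_needed(math.exp, 0.0, 1.0, eps)
gap = riemann_gap_increasing(math.exp, 0.0, 1.0, m)
print(m, gap)   # m = 1719, and the gap is just below eps
```

The number of subintervals grows like $1/\varepsilon$, which already suggests that computing integrals directly from upper and lower sums is rarely practical.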
6.3.3 Two Simple Integrals Computed


In this short section we shall compute our first integrals. The two results are not very
impressive, and the treatment of the first function may seem tortuous, but a
wait-and-see attitude is required. They will be used to find the integrals of step functions in
Sect. 6.4.


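Function A. Let $f : [a,b] \to \mathbb{R}$ where $f(x) = 0$ for $a < x < b$ but $f(a)$ and $f(b)$ are not necessarily 0. Then $f$ is integrable and $\int f = 0$.

For each $\varepsilon$ with $0 < \varepsilon < (b-a)/2$ we consider the partition $P_\varepsilon = (a, a+\varepsilon, b-\varepsilon, b)$. If $f(a)$ and $f(b)$ are positive, then, for all such $\varepsilon$, we have

\[ U(f, P_\varepsilon) = \varepsilon \bigl(f(a) + f(b)\bigr). \]

If $f(a) > 0 \ge f(b)$, then, for all such $\varepsilon$, we have

\[ U(f, P_\varepsilon) = \varepsilon f(a). \]

If $f(b) > 0 \ge f(a)$, then, for all such $\varepsilon$, we have

\[ U(f, P_\varepsilon) = \varepsilon f(b). \]

Finally, if neither $f(a)$ nor $f(b)$ is positive, then, for all such $\varepsilon$, we have

\[ U(f, P_\varepsilon) = 0. \]

From these facts it is clear that

\[ \overline{\int} f = \inf_P U(f, P) \le \inf_{\varepsilon} U(f, P_\varepsilon) = 0. \]

That is, $\overline{\int} f \le 0$. Similar considerations apply to $L(f, P)$ and show that $\underline{\int} f \ge 0$.

Hence $\underline{\int} f = \overline{\int} f = 0$ and so we have $\int f = 0$. The argument is illustrated in Fig. 6.4.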


Fig. 6.4 An upper sum for function A


Function B. Let $g : [a, b] \to \mathbb{R}$ be the constant $C$. Then $\int g = C(b-a)$.
Now $U(g, P) = L(g, P) = C(b - a)$ for every partition and therefore $g$ is integrable with $\int g = C(b-a)$.


6.4 Basic Integration Rules 


The rules proved in this section enable us to build new integrable functions from the 
old ones. Loosely described, the sum and product of integrable functions are inte- 
grable. Moreover integration is a linear operation in the space of functions integrable 
on a given interval. 

In the preamble to rules and propositions, we shall often write that the functions 
are bounded before assuming that they are integrable. Though logically unnecessary, 
it could be useful to emphasise that Riemann—Darboux integration applies only to 
bounded functions. 


Proposition 6.7 (Sum of functions) Let $f : [a,b] \to \mathbb{R}$ and $g : [a,b] \to \mathbb{R}$ be bounded functions and assume that they are both integrable. Then $f + g$ is integrable and

\[ \int (f + g) = \int f + \int g. \]


Proof Let $P = (t_0, t_1, \dots, t_m)$ be a partition of $[a, b]$. Set

\[
m_j = \inf_{[t_j, t_{j+1}]} (f + g), \qquad m'_j = \inf_{[t_j, t_{j+1}]} f, \qquad m''_j = \inf_{[t_j, t_{j+1}]} g,
\]

with similar definitions for $M_j$, $M'_j$, $M''_j$ using suprema instead of infima.

For $x$ in $[t_j, t_{j+1}]$ we have $f(x) + g(x) \le M'_j + M''_j$, so that we find $M_j \le M'_j + M''_j$. Similarly $m_j \ge m'_j + m''_j$. These give the inequalities

\[
U(f + g, P) \le U(f, P) + U(g, P), \qquad L(f + g, P) \ge L(f, P) + L(g, P).
\]

Let $\varepsilon > 0$. There exists a partition $P$ (see the discussion in Sect. 6.3 on this point), such that

\[
U(f, P) < \int f + \varepsilon, \qquad U(g, P) < \int g + \varepsilon,
\]
\[
L(f, P) > \int f - \varepsilon, \qquad L(g, P) > \int g - \varepsilon.
\]

We obtain

\[
U(f + g, P) - L(f + g, P) \le U(f, P) - L(f, P) + U(g, P) - L(g, P) < 4\varepsilon.
\]

This shows that Riemann’s condition holds for $f + g$. In addition we have

\[
\begin{aligned}
\int f + \int g - 2\varepsilon &< L(f, P) + L(g, P) \le L(f + g, P) \\
&\le \int (f + g) \le U(f + g, P) \le U(f, P) + U(g, P) < \int f + \int g + 2\varepsilon
\end{aligned}
\]

so that the inequality

\[
\int f + \int g - 2\varepsilon < \int (f + g) < \int f + \int g + 2\varepsilon
\]

holds for all $\varepsilon > 0$. We conclude that $\int (f + g) = \int f + \int g$.


Proposition 6.8 (Multiplication by scalars) Let $f : [a,b] \to \mathbb{R}$ be bounded and integrable. Let $\alpha$ be a real number. Then the function $\alpha f$ is integrable on $[a, b]$ and

\[ \int \alpha f = \alpha \int f. \]


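Proof For an arbitrary set $B$ we have the equalities

\[
\sup_B (\alpha f) = \alpha \sup_B f, \qquad \inf_B (\alpha f) = \alpha \inf_B f \qquad (\alpha > 0) \tag{6.1}
\]

and

\[
\sup_B (\alpha f) = \alpha \inf_B f, \qquad \inf_B (\alpha f) = \alpha \sup_B f \qquad (\alpha < 0). \tag{6.2}
\]

Hence

\[
U(\alpha f, P) = \alpha U(f, P), \qquad L(\alpha f, P) = \alpha L(f, P) \qquad (\alpha > 0)
\]

and

\[
U(\alpha f, P) = \alpha L(f, P), \qquad L(\alpha f, P) = \alpha U(f, P) \qquad (\alpha < 0).
\]

In the case $\alpha > 0$ we therefore have

\[
\sup_P L(\alpha f, P) = \alpha \sup_P L(f, P) = \alpha \int f = \alpha \inf_P U(f, P) = \inf_P U(\alpha f, P).
\]

The extreme terms are therefore equal. Hence each is the same as $\int \alpha f$ and at the same time $\alpha \int f$.

In the case $\alpha < 0$ we have

\[
\sup_P L(\alpha f, P) = \alpha \inf_P U(f, P) = \alpha \int f = \alpha \sup_P L(f, P) = \inf_P U(\alpha f, P)
\]

with the same conclusion.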


Exercise Prove the formulas (6.1) and (6.2) in the proof of Proposition 6.8.


Proposition 6.9 (Join of intervals) Let $f : [a,b] \to \mathbb{R}$ be bounded and let $a < c < b$. If $f$ is integrable on $[a,c]$ and also on $[c, b]$ then $f$ is integrable on $[a, b]$ and

\[ \int_{[a,b]} f = \int_{[a,c]} f + \int_{[c,b]} f. \]

Conversely if $f$ is integrable on $[a, b]$, then $f$ is also integrable on $[a,c]$ and on $[c, b]$ and the same equation holds.


Proof Consider the first assertion. Let $f$ be integrable both on $[a, c]$ and on $[c, b]$. Denote by $f_1$ the restriction of $f$ to $[a, c]$ and by $f_2$ the restriction of $f$ to $[c, b]$. Let $\varepsilon > 0$. Choose partitions $P_1$ on $[a,c]$ and $P_2$ on $[c, b]$, such that

\[
U(f_1, P_1) - \varepsilon < \int_{[a,c]} f < L(f_1, P_1) + \varepsilon
\]

and

\[
U(f_2, P_2) - \varepsilon < \int_{[c,b]} f < L(f_2, P_2) + \varepsilon.
\]

Next construct a partition $P$ on $[a, b]$ by uniting $P_1$ and $P_2$. It is clear that

\[ L(f, P) = L(f_1, P_1) + L(f_2, P_2) \]

and

\[ U(f, P) = U(f_1, P_1) + U(f_2, P_2). \]

But then we get

\[
U(f, P) - 2\varepsilon < \int_{[a,c]} f + \int_{[c,b]} f < L(f, P) + 2\varepsilon.
\]

This gives $U(f, P) - L(f, P) < 4\varepsilon$ and Riemann’s condition is satisfied for $f$ on $[a, b]$. This allows us to expand the last inequalities to

\[
\int_{[a,c]} f + \int_{[c,b]} f - 2\varepsilon < L(f, P) \le \int_{[a,b]} f \le U(f, P) < \int_{[a,c]} f + \int_{[c,b]} f + 2\varepsilon
\]

which are valid for all $\varepsilon > 0$. The first claim of the proposition now follows.

For the second assertion we must show that $f_1$ and $f_2$ are integrable given that $f$ is integrable. Let $\varepsilon > 0$. We consider a partition $P$ of $[a, b]$, which contains the point $c$ and satisfies $U(f, P) - L(f, P) < \varepsilon$. From $P$ we make in an obvious way partitions $P_1$ of $[a,c]$ and $P_2$ of $[c, b]$ which satisfy $U(f_1, P_1) - L(f_1, P_1) < \varepsilon$ and $U(f_2, P_2) - L(f_2, P_2) < \varepsilon$. Hence $f_1$ and $f_2$ satisfy Riemann’s condition, and the equation then follows from the first assertion.


6.4.1 Integration of Step Functions


Let $f : [a,b] \to \mathbb{R}$ be a step function. There is a partition $(t_0, t_1, t_2, \dots, t_m)$ of $[a,b]$, and numbers $(c_0, c_1, c_2, \dots, c_{m-1})$, such that $f(x) = c_j$ for $t_j < x < t_{j+1}$, $j = 0, 1, 2, \dots, m-1$.

Consider the restriction of $f$ to the interval $[t_j, t_{j+1}]$. This is a constant $c_j$, plus a function that is 0 in the open interval $]t_j, t_{j+1}[$, though not necessarily 0 at its endpoints.

We conclude, by Proposition 6.9 and the two simple integrals calculated in Sect. 6.3, that $f$ is integrable on each subinterval of the partition, and hence also on $[a, b]$, and moreover that

\[
\int_a^b f = \sum_{j=0}^{m-1} \int_{t_j}^{t_{j+1}} f = \sum_{j=0}^{m-1} c_j (t_{j+1} - t_j).
\]


The Riemann—Darboux integral gives the “right answer” for the integral of a step 
function. Note that the values taken by f at the points of the partition do not influence 
the outcome. 
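As a small computational illustration of this formula, here is a Python sketch. The function name `step_integral` and the sample data are assumptions made for the example. The step function is described only by its partition points and its values on the open subintervals, which reflects the remark that the values at the partition points play no role.

```python
def step_integral(points, values):
    """Integral of the step function equal to values[j] on ]points[j], points[j+1][.

    Values taken at the partition points themselves are deliberately not needed.
    """
    if len(values) != len(points) - 1:
        raise ValueError("need exactly one value per subinterval")
    return sum(c * (points[j + 1] - points[j]) for j, c in enumerate(values))

# Hypothetical step function on [0, 4]: 2 on ]0,1[, -1 on ]1,3[, 5 on ]3,4[.
result = step_integral([0.0, 1.0, 3.0, 4.0], [2.0, -1.0, 5.0])
```

For this data the sum is $2 \cdot 1 + (-1) \cdot 2 + 5 \cdot 1 = 5$.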


6.4.2 The Integral from a to b 


Up to now the integral has been defined over the set [a, b]. A new twist introduces 
integrals over directed intervals; the integral from a to b, or the integral with lower 
limit a and upper limit b. The terminology is not supposed to imply that a < b; 
indeed, we could have b < a or a = b. The use of the term “limit” is customary 
here. 


Definition Let $A$ be a closed and bounded interval and $f : A \to \mathbb{R}$ a bounded function that is integrable on $A$. Let $a$ and $b$ be points in $A$. We define

\[ \int_a^b f = \int_{[a,b]} f \quad \text{if } a < b, \]

\[ \int_a^b f = -\int_{[b,a]} f \quad \text{if } a > b, \]

and

\[ \int_a^b f = 0 \quad \text{if } a = b. \]


Proposition 6.10 Let $a$, $b$ and $c$ be points of $A$ in any order. Then

\[ \int_a^b f = \int_a^c f + \int_c^b f. \]


Proof One can consider all six possibilities for the ordering of a, b and c and use 
Proposition 6.9 on the join of intervals. It all works out and the reader is invited to 
check it. 
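The sign convention for directed integrals, and the additivity rule $\int_a^b f = \int_a^c f + \int_c^b f$ for points in any order, can be tried out in a few lines of Python. This is only a sketch: `directed_integral` and `integral_of_identity` are hypothetical names, and the exact integral of $f(x) = x$ over an interval is used as a stand-in for a general integration routine.

```python
def directed_integral(integral_over, a, b):
    """Integral from a to b, given a routine for integrals over [lo, hi] with lo <= hi."""
    if a < b:
        return integral_over(a, b)
    if a > b:
        return -integral_over(b, a)
    return 0.0

def integral_of_identity(lo, hi):
    # Exact integral of f(x) = x over the set [lo, hi]; a stand-in for illustration.
    return (hi * hi - lo * lo) / 2.0

# Points deliberately out of order: b < c < a.
a, b, c = 2.0, -1.0, 0.5
lhs = directed_integral(integral_of_identity, a, b)
rhs = (directed_integral(integral_of_identity, a, c)
       + directed_integral(integral_of_identity, c, b))
```

Both `lhs` and `rhs` come out as $-1.5$, one of the six orderings that the reader is invited to check by hand.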


6.4.3 Leibniz’s Notation for Integrals 


Leibniz denoted the integral $\int_a^b f$ by $\int_a^b f(x)\,dx$. We often use this notation; it has many advantages comparable to the advantages of using Leibniz’s notation for derivatives.

As an example we can write

\[ \int_0^{\pi} \sqrt{\sin x}\, dx \]

to denote $\int_0^{\pi} f$ where $f$ is the function $f(x) = \sqrt{\sin x}$. This could then be read as

The integral of the square root of the sine of x with respect to x from 0 to $\pi$.


But I am sure most English-speaking mathematicians read it according to the fol- 
lowing phonetics: 


The integral of the square root of sine ex dee ex from nought [zero in US] to pie. 


In differential geometry expressions like f(x) dx can be precisely defined and 
are called differential forms. It is differential forms that are integrated, rather than 
functions. But that is a whole new topic beyond fundamental analysis. In this text the 
expression “dx” has no independent meaning, other than indicating how the integral 
should be understood. Consider for example the two integrals 


\[ \int_0^1 \sqrt{x^2 + a^4}\, dx, \qquad \int_0^1 \sqrt{x^2 + a^4}\, da. \]

Here, two unlike functions are to be integrated. In the first place $f(x) = \sqrt{x^2 + a^4}$ where $a$ is a constant; in the second place $g(a) = \sqrt{x^2 + a^4}$ where $x$ is a constant.


6.4.4 Useful Estimates 


The reader who has studied vector spaces may recognise that the two integration rules, Propositions 6.7 and 6.8, assert that the set of all functions integrable on the interval $[a, b]$ constitutes a vector space over the field $\mathbb{R}$ and that the integral is a linear functional.

Now functions also possess an ordering, where $f \le g$ means that $f(x) \le g(x)$ for all $x$ in the common domain. We shall see that integration is not only linear, respecting addition and scalar multiplication, but it is a positive linear functional, by which we mean that it respects the ordering.


Proposition 6.11 Let $f : [a,b] \to \mathbb{R}$ be bounded and integrable, and assume that $f(x) \ge 0$ for all $x \in [a, b]$. Then

\[ \int_a^b f \ge 0. \]

Proof It is obvious that $L(f, P) \ge 0$ for every partition $P$. Hence also $\int_a^b f \ge 0$.


Proposition 6.12 Let $f : [a,b] \to \mathbb{R}$, $g : [a,b] \to \mathbb{R}$ be bounded and integrable, and suppose that $f(x) \le g(x)$ for all $x \in [a,b]$. Then

\[ \int_a^b f \le \int_a^b g. \]

Proof Because $\int_a^b (g - f) \ge 0$ and it equals $\int_a^b g - \int_a^b f$.


Proposition 6.13 Let $f : [a,b] \to \mathbb{R}$ be bounded and integrable. Suppose that $m \le f(x) \le M$ for all $x$ in $[a, b]$. Then

\[ m(b - a) \le \int_a^b f \le M(b - a). \]

Proof Integrating the inequalities $m \le f(x) \le M$ we find

\[ m(b - a) \le \int_a^b f \le M(b - a). \]


The inequalities appearing in the next two propositions are immensely important. 


Proposition 6.14 Let $f : [a,b] \to \mathbb{R}$ be bounded and integrable. Then the function $|f|$ is also integrable and

\[ \left| \int_a^b f \right| \le \int_a^b |f|. \]

Proof The proof that $|f|$ is integrable is an exercise (see below). Now $f \le |f|$ and $-f \le |f|$ on $[a, b]$, so that

\[ \int_a^b f \le \int_a^b |f| \quad \text{and} \quad -\int_a^b f \le \int_a^b |f|. \]

One of the left-hand sides is equal to $\left| \int_a^b f \right|$.


Proposition 6.15 (Mean value theorem for integrals) Let $f : [a,b] \to \mathbb{R}$ be continuous and let $g : [a,b] \to \mathbb{R}$ be bounded, integrable and non-negative. Then there exists $\xi$ in $[a, b]$, such that

\[ \int_a^b f g = f(\xi) \int_a^b g. \]

Proof Let $M = \max_{[a,b]} f$ and $m = \min_{[a,b]} f$. Then, since $g(x) \ge 0$, we have

\[ m g(x) \le f(x) g(x) \le M g(x) \]

for all $x$, and hence

\[ m \int_a^b g = \int_a^b m g \le \int_a^b f g \le \int_a^b M g = M \int_a^b g. \]

The number $\int_a^b f g$ lies between the minimum and the maximum of $f(x) \int_a^b g$ on $[a, b]$. Since $f$ is continuous, we can apply the intermediate value theorem and conclude that there exists $\xi$ in the interval $[a, b]$, such that

\[ \int_a^b f g = f(\xi) \int_a^b g. \]
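To see the mean value theorem for integrals at work, one can locate a suitable $\xi$ numerically. The Python sketch below rests on assumptions that go beyond the proposition: the integrals are approximated by a midpoint rule (the helper `midpoint_integral`), and $f$ is taken to be increasing so that plain bisection can stand in for the intermediate value theorem. All names are hypothetical.

```python
def midpoint_integral(h, a, b, n=10000):
    """Midpoint-rule approximation of the integral of h over [a, b]."""
    width = (b - a) / n
    return sum(h(a + (k + 0.5) * width) for k in range(n)) * width

def mean_value_point(f, g, a, b, tol=1e-10):
    """Approximate xi with f(xi) * int(g) = int(f * g).

    Assumptions of this sketch: f is increasing and the integral of g is positive.
    """
    target = midpoint_integral(lambda x: f(x) * g(x), a, b) / midpoint_integral(g, a, b)
    lo, hi = a, b
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Example: f(x) = x and g(x) = x on [0, 1], so int(fg) = 1/3 and int(g) = 1/2.
xi = mean_value_point(lambda x: x, lambda x: x, 0.0, 1.0)
```

In this example the theorem predicts $f(\xi) = (1/3)/(1/2)$, that is $\xi = 2/3$, and the computed value agrees to within the accuracy of the quadrature.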


6.4.5 Exercises 


1. Show that if $f$ and $g$ are bounded functions, both integrable on the interval $[a, b]$, then so is $fg$.
Hint. First show that the square of an integrable function is integrable. Then use the identity $4fg = (f + g)^2 - (f - g)^2$. To show that $f^2$ is integrable, given that $f$ is bounded and integrable, you may want to compare the oscillation $\Omega_{[c_1, c_2]}(f^2)$ with $\Omega_{[c_1, c_2]}(f)$.

2. In the mean value theorem for integrals (Proposition 6.15) it was assumed that the function $g$ was non-negative. Obviously the same result holds if it is assumed instead that $g$ is non-positive. Show, however, by means of an example, that the conclusion may not hold if $g$ takes both positive and negative values.

3. Let $f$ be bounded on $[a, b]$. Show that for any subinterval $[c_1, c_2]$ of $[a, b]$ we have

\[ \Omega_{[c_1, c_2]}(|f|) \le \Omega_{[c_1, c_2]}(f). \]

Deduce that if $f$ is integrable then so is $|f|$.

4. Suppose that $f$ is integrable on the interval $[a, b]$ and that there exists $\alpha > 0$ such that $f(x) \ge \alpha$ for all $x$ in $[a, b]$. Show that $\sqrt{f}$ is integrable on $[a, b]$.
Hint. This is similar to the cases $|f|$ and $f^2$. One can compare the oscillation of $\sqrt{f}$ on an interval with that of $f$. Try using the identity

\[ \sqrt{y} - \sqrt{x} = \frac{y - x}{\sqrt{y} + \sqrt{x}}. \]

See also Exercise 12 below.

5. Suppose that $f$ is bounded and integrable on the interval $[a, b]$. Suppose that $g$ is obtained from $f$ by changing the values of $f$ at a finite number of points. Show that $g$ is integrable and that $\int g = \int f$.

6. Suppose that $f$ is bounded on $[a, b]$, and continuous except at a finite number of points. Show that $f$ is integrable.
Hint. By using the join of intervals one may assume that $f$ is continuous except at $a$.

7. Suppose that $f$ is bounded on $[a, b]$ and for each $h > 0$ it is integrable on $[a + h, b]$. Show that $f$ is integrable on $[a, b]$.

8. Let $f$ be defined on all of $\mathbb{R}$. For each real number $c$, we define the translated function $f_c$ by $f_c(x) = f(x - c)$. Suppose that $f$ is integrable on all bounded intervals. Show that

\[ \int_a^b f = \int_{a+c}^{b+c} f_c \]

for all $a$ and $b$.

9. Let $f$ be integrable on the interval $[-L, L]$ and define

\[ F(x) = \int_0^x f, \qquad (-L \le x \le L). \]

Show that if $f$ is an even function then $F$ is an odd function, whereas if $f$ is an odd function then $F$ is an even function. This results in the often useful observation that

\[ \int_{-L}^{L} f = 0 \]

if $f$ is an odd function.
10. () Prove Hölder’s inequality for integrals. Let $f$ and $g$ be positive functions, integrable on the interval $[a, b]$. Let $p$ and $q$ be positive numbers that satisfy $(1/p) + (1/q) = 1$. Then

\[ \int_a^b f g \le \left( \int_a^b f^p \right)^{1/p} \left( \int_a^b g^q \right)^{1/q}. \]

Hint. See the hint for Hölder’s inequality for series, Sect. 5.11 Exercise 2.

11. A function can be integrable though it is discontinuous at infinitely many points. Consider the function $f$ with domain $[0, 1]$ defined as follows. Firstly we set $f(0) = f(1) = 0$ (these values do not of course matter). Secondly, for $0 < x < 1$ and $x$ irrational we set $f(x) = 0$. Thirdly for $0 < x < 1$ and $x$ rational we write $x = a/b$, where $a$ and $b$ are positive integers with highest common factor 1, and set $f(x) = 1/b$. Show that $f$ is integrable.
Hint. Obviously $L(f, P) = 0$ for any partition $P$. The thing is to show that $U(f, P)$ can be made as small as we like by choosing an appropriate partition.

12. The proofs that $f^2$ and $|f|$ were integrable (Exercises 1 and 3) depended on the Lipschitz continuity (see Sect. 4.5 Exercise 1 for the definition) of the functions $x^2$ and $|x|$, for the former function on a bounded interval $[-M, M]$. The same was true of $\sqrt{f}$ given that $f \ge \alpha > 0$ for some constant $\alpha$ (Exercise 4). This procedure will not work to prove that $\sqrt{f}$ is integrable given only that $f$ is integrable on $[a, b]$ and $f \ge 0$. The problem is that $\sqrt{x}$ is not Lipschitz continuous on the interval $[0, M]$ (where $M = \sup f$). To prove integrability of $\sqrt{f}$, we need a more powerful approach. We frame a general proposition and invite the reader to prove it. It includes probably all cases of Riemann integrability that are met with in practice.

Let $f$ be integrable on $[a, b]$, let $M = \sup |f|$ and let $g$ be continuous on $[-M, M]$. Then $g \circ f$ is integrable on $[a, b]$.

The following steps are suggested:

(a) Show that for every $\varepsilon > 0$ there exists $\delta > 0$, such that for every interval $[c_1, c_2] \subset [a, b]$, if $\Omega_{[c_1, c_2]}(f) < \delta$ then $\Omega_{[c_1, c_2]}(g \circ f) < \varepsilon$.
Hint. Use uniform continuity of $g$ on $[-M, M]$ (Proposition 4.15).

(b) Let $K = \sup |g \circ f|$. Let $\varepsilon > 0$. Let $\delta$ correspond to $\varepsilon$ as in part (a). Choose a partition $P$, such that $U(f, P) - L(f, P) < \varepsilon\delta$. Show that

\[ U(g \circ f, P) - L(g \circ f, P) < (2K + b - a)\varepsilon \]

and deduce that $g \circ f$ is integrable.

13. Use the result of the previous exercise to show that if $f$ is integrable and non-negative, then $\sqrt{f}$ is integrable.

14. After all the preceding exercises, it is useful to have an example of a bounded function that is not integrable. An oft quoted one is the function $f$ on the interval $[0, 1]$ defined by

\[
f(x) = \begin{cases} 0 & \text{if } x \text{ is irrational} \\ 1 & \text{if } x \text{ is rational.} \end{cases}
\]

Show that $\overline{\int} f = 1$ and $\underline{\int} f = 0$.
Note. This is a standard example of a function that is not Riemann integrable, but is Lebesgue integrable.


15. Let $f$ and $g$ be integrable on the interval $[a,b]$. Show that the functions $\min(f, g)$ and $\max(f, g)$ are integrable. In particular $f_+ := \max(f, 0)$ and $f_- := \max(-f, 0)$, called the positive and negative parts of $f$, are integrable.
Note. These functions are defined pointwise: $\min(f, g)(x) = \min(f(x), g(x))$.


16. Find an example of a function f such that | f| is integrable but f is not. 


6.5 The Connection Between Integration and 
Differentiation 


In this section we show that in some sense integration and differentiation are oper- 
ations inverse to each other. This finds its expression in the fundamental theorem of 
calculus. It means that integration can be used to solve the problem of antiderivatives 
(Problem A) and conversely antiderivatives can be used to compute integrals. More 
generally integrals can be used to solve differential equations. 

An important role is played by the inequality

\[ \left| \int_a^b f \right| \le \int_a^b |f| \]

which holds if $a < b$. One has to be careful; if $a > b$ the correct inequality is

\[ \left| \int_a^b f \right| \le \left| \int_a^b |f| \right|. \]


Proposition 6.16 Let $f : [a,b] \to \mathbb{R}$ be bounded and integrable. Let

\[ F(x) = \int_a^x f \]

for all $x$ in $[a, b]$. Then the following hold:

(1) $F : [a,b] \to \mathbb{R}$ is continuous.
(2) If $a < x_0 < b$ and $f$ is continuous at $x_0$, then $F$ is differentiable at $x_0$ and

\[ F'(x_0) = f(x_0). \]


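Proof (1) Let $K = \sup |f|$. Suppose first that $a \le x < b$ and consider $\lim_{h \to 0+} F(x + h)$. For $h > 0$ we have

\[ F(x + h) - F(x) = \int_a^{x+h} f - \int_a^x f = \int_x^{x+h} f, \]

so that

\[ |F(x + h) - F(x)| = \left| \int_x^{x+h} f \right| \le \int_x^{x+h} |f| \le K h. \]

We conclude that $\lim_{h \to 0+} F(x + h) = F(x)$. Similarly $\lim_{h \to 0-} F(x + h) = F(x)$ for $a < x \le b$. This shows that $F$ is continuous.

(2) Let $f$ be continuous at $x_0$. We will show that both left and right limits of the difference quotient of $F$ at $x_0$ are equal to $f(x_0)$.

Let $\varepsilon > 0$. There exists $\delta > 0$, such that $|f(x) - f(x_0)| < \varepsilon$ whenever $|x - x_0| < \delta$. For the right limit of the difference quotient, we let $0 < h < \delta$ and consider

\[
\begin{aligned}
\frac{F(x_0 + h) - F(x_0)}{h} - f(x_0) &= \frac{1}{h} \int_{x_0}^{x_0 + h} f(t)\, dt - f(x_0) \\
&= \frac{1}{h} \int_{x_0}^{x_0 + h} f(t)\, dt - \frac{1}{h} \int_{x_0}^{x_0 + h} f(x_0)\, dt \\
&= \frac{1}{h} \int_{x_0}^{x_0 + h} \bigl(f(t) - f(x_0)\bigr)\, dt.
\end{aligned}
\]

We know that $|f(t) - f(x_0)| < \varepsilon$ if $x_0 \le t \le x_0 + h$. Hence for $0 < h < \delta$ we have

\[
\left| \frac{F(x_0 + h) - F(x_0)}{h} - f(x_0) \right| \le \frac{1}{h} \int_{x_0}^{x_0 + h} |f(t) - f(x_0)|\, dt \le \frac{1}{h} \varepsilon h = \varepsilon
\]

and conclude that

\[ \lim_{h \to 0+} \frac{F(x_0 + h) - F(x_0)}{h} = f(x_0). \]

Next we consider the left limit. Whatever the sign of $h$, we always have

\[
\frac{F(x_0 + h) - F(x_0)}{h} - f(x_0) = \frac{1}{h} \int_{x_0}^{x_0 + h} \bigl(f(t) - f(x_0)\bigr)\, dt,
\]

but for $h < 0$ the estimate of the integral is trickier since $x_0 + h < x_0$. For $-\delta < h < 0$ we have

\[
\left| \frac{1}{h} \int_{x_0}^{x_0 + h} \bigl(f(t) - f(x_0)\bigr)\, dt \right| = \left| -\frac{1}{h} \int_{x_0 + h}^{x_0} \bigl(f(t) - f(x_0)\bigr)\, dt \right| \le \frac{1}{|h|} \varepsilon |h| = \varepsilon
\]

and find that

\[ \lim_{h \to 0-} \frac{F(x_0 + h) - F(x_0)}{h} = f(x_0). \]

Putting together the left and right limits we conclude that $F'(x_0) = f(x_0)$.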
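The convergence of the difference quotient of $F$ to $f(x_0)$ from both sides can be observed numerically. The following Python sketch is an illustration under stated assumptions: the integral is approximated by a midpoint rule, the names `midpoint_integral` and `difference_quotient` are hypothetical, and $f = \cos$ with $x_0 = 1$ is chosen for the example. It uses the fact from the proof that $F(x_0 + h) - F(x_0) = \int_{x_0}^{x_0+h} f$, whatever the sign of $h$.

```python
import math

def midpoint_integral(f, a, b, n=2000):
    """Midpoint-rule approximation of the directed integral of f from a to b.

    When b < a the step width is negative, which reproduces the sign convention
    for integrals with limits.
    """
    width = (b - a) / n
    return sum(f(a + (k + 0.5) * width) for k in range(n)) * width

def difference_quotient(f, x0, h):
    """(F(x0 + h) - F(x0)) / h, where F(x) is the integral of f up to x."""
    return midpoint_integral(f, x0, x0 + h) / h

x0 = 1.0
errors = [abs(difference_quotient(math.cos, x0, h) - math.cos(x0))
          for h in (0.1, 0.01, -0.01)]
```

The entries of `errors` shrink with $|h|$, and the left-hand quotient (negative $h$) behaves like the right-hand one.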
Here are some further, often used, rules. They are simple consequences of Proposition 6.16 and the definition of integral with limits. The reader should supply the proofs.

(i) If $G(x) = \int_x^b f$ for all $x$ in $[a,b]$, $x_0$ is in the open interval $]a, b[$ and $f$ is continuous at $x_0$, then $G'(x_0) = -f(x_0)$.

(ii) If $a < c < b$, $G(x) = \int_c^x f$ for all $x$ in $[a, b]$, $x_0$ is in the open interval $]a, b[$ and $f$ is continuous at $x_0$, then $G'(x_0) = f(x_0)$.


Proposition 6.17 (The fundamental theorem of calculus) Let $f : [a,b] \to \mathbb{R}$ be bounded in $[a,b]$ and continuous in $]a, b[$. Suppose that there exists a function $g : [a,b] \to \mathbb{R}$, continuous in $[a, b]$ and differentiable in $]a, b[$, such that $f = g'$ in $]a, b[$. Then

\[ \int_a^b f = g(b) - g(a). \]

Proof Set

\[ F(x) = \int_a^x f, \qquad (a \le x \le b). \]

Then $F$ is differentiable with $F'(x) = f(x)$ for $a < x < b$, continuous for $a \le x \le b$, and $F(a) = 0$. But $f(x) = g'(x)$ for $a < x < b$. Hence $(F - g)' = 0$ and $F - g$ is therefore constant in $]a, b[$. Now $F$ and $g$ are continuous in $[a, b]$ so that we can pass to the endpoints and deduce

\[ F(b) - g(b) = F(a) - g(a), \]

that is, $F(b) = g(b) - g(a)$.


When the fundamental theorem is used to calculate an integral, it is usually applied in the following way. The given function $f$ (the integrand) is continuous in an open interval $A$, possibly unbounded. An antiderivative $g$ is known or found; that is $g$ is defined in $A$ and $g' = f$. Then for all $a$ and $b$ in $A$ (their order does not matter) we have, in the conventional notation:

\[ \int_a^b f = g(x) \Big|_a^b = g(b) - g(a). \]
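A quick numerical check of the formula $\int_a^b f = g(b) - g(a)$, including the remark that the order of $a$ and $b$ does not matter, might look as follows in Python. This is a sketch under assumptions: the integral is approximated by a midpoint rule with a hypothetical helper `midpoint_integral`, and the pair $f(x) = 1/(1 + x^2)$, $g(x) = \arctan x$ is chosen as the example.

```python
import math

def midpoint_integral(f, a, b, n=2000):
    """Midpoint-rule approximation of the directed integral of f from a to b."""
    width = (b - a) / n          # negative when b < a
    return sum(f(a + (k + 0.5) * width) for k in range(n)) * width

def f(x):
    return 1.0 / (1.0 + x * x)

g = math.atan                    # an antiderivative of f on the whole real line

forward = midpoint_integral(f, 0.0, 1.0)    # should be close to g(1) - g(0)
backward = midpoint_integral(f, 1.0, 0.0)   # should be close to g(0) - g(1)
```

The value `forward` approximates $\pi/4$, and `backward` approximates its negative.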


Rephrasing this slightly, let $g$ be differentiable in the open interval $A$ and let $g'$ be continuous in $A$. Let $a$ be a point of $A$. Then for all $x$ in $A$ we have

\[ \int_a^x g' = g(x) - g(a). \]


This brings out strongly the extent to which integration and differentiation are 
inverse operations. But some emphasis falls on the requirement that g’ should be 
continuous. If we drop this requirement it may happen that g’ is not even integrable
on the interval [a, x]. This may be regarded as a defect of the Riemann—Darboux 
integral, though opinions are divided on this point. 

The fundamental theorem shows how an antiderivative, when known, can help us 
calculate an integral. The converse of this is that integrals show that antiderivatives 
exist. We can solve the simplest of all differential equations, dy/dx = f(x), using 
an integral. The proof of the following proposition should now be obvious. 


Proposition 6.18 Let f be continuous in the open interval A. We have the following 
conclusions: 


(1) The function $f$ has an antiderivative, that is, there exists a function $g$ with domain $A$ such that $g' = f$.

(2) If $g$ is one antiderivative, then the most general antiderivative is $g + C$ where $C$ is a constant.

(3) Let $a \in A$. The function $F(x) = \int_a^x f$, $(x \in A)$, is an antiderivative for $f$.


The solution of the differential equation $dy/dx = f(x)$ is often written as

\[ y(x) = \int f(x)\, dx + C. \]

The formal integral here, written in Leibniz’s notation but without the limits, is called an indefinite integral. It represents any antiderivative of $f$. Finding an antiderivative is often called solving the integral $\int f(x)\, dx$, especially when it is achieved by applying a set of techniques described later (in Chap. 8). In contrast to this, an integral with limits is sometimes called a definite integral.


6.5.1 Thoughts About the Fundamental Theorem 


The fundamental theorem asserts that the formula $\int_a^b f = F(b) - F(a)$ holds, when
f is continuous and F is an antiderivative of f. This is the form of the fundamental 
theorem as usually presented in analysis texts. However, the requirement that f is 
continuous is restrictive since many discontinuous functions are integrable. 

In applications (for example, Fourier series, technology and engineering) discon- 
tinuous integrands arise frequently and it is therefore useful to extend the fundamen- 
tal theorem to a larger class of integrands. Ideally we would like to extend it to all 
integrable functions. 

It is perhaps unhelpful to overemphasise the role of antiderivative as a central 
concept in integration theory. It could be preferable to introduce instead a notion of 
primitive function, distinct from that of antiderivative. A small warning: “primitive 
function” is often used as a synonym for “antiderivative”, but in this text a different
usage is proposed. 
Definition Given an integrable function $f$ on $[a, b]$, by a primitive of $f$ is meant any function that differs from the function

\[ F(x) = \int_a^x f \]

by a constant.

We immediately extend this definition to the case of a function defined on an interval $A$, which could be unbounded. Useful cases, to be considered later, are $A = [0, \infty[$ and $A = ]-\infty, \infty[$. Suppose that $f$ is integrable on every bounded subinterval $[a, b]$ of $A$. We shall say that a function $g$ with domain $A$ is a primitive of $f$, if there exists $a$ in $A$, such that the function

\[ g(x) - \int_a^x f \]

is a constant.


Exercise We can fix the point $a$ in the definition of primitive in advance. Let $c \in A$. Show that $g$ is a primitive of $f$ if and only if the function $g(x) - \int_c^x f$ is a constant.


The fundamental theorem states that if f is continuous in an interval A, then any 
function g, continuous in A, and differentiable and satisfying g’(x) = f(x) at all 
the interior points of A, is a primitive of f. This opens up the question of how to 
find primitives for more general types of integrable function. An ideal version of the 
fundamental theorem would enable us to identify the primitives of a given integrable 
function quite generally. 

We shall describe a class of functions that often arise in practical applications and 
identify their primitive functions, thereby extending the fundamental theorem. 


Definition A function $f : [a,b] \to \mathbb{R}$ is said to be piece-wise continuous if there exists a partition $a = t_0 < t_1 < \dots < t_m = b$, such that for each $k$, the restriction of $f$ to the open interval $]t_k, t_{k+1}[$ is continuous, and extends to a continuous function in the closed interval $[t_k, t_{k+1}]$.


Putting it differently a piece-wise continuous function $f$ has only a finite number of discontinuities and at each one, the left and right limits exist (though only the right limit at $a$ and only the left limit at $b$). The values of $f$ at the points $t_k$ are unimportant.

We extend the definition of piece-wise continuous function to unbounded domains. 
A function f defined in an interval A (possibly unbounded) is said to be piece-wise 
continuous if its restriction to each bounded interval [a, b], included in A, is piece- 
wise continuous in the sense of the previous paragraph. A piece-wise continuous 
function in an unbounded interval can have infinitely many points of discontinuity, 
but there are only finitely many in each bounded closed interval. An example of such 
a function of some importance in technology is the infinite square wave. 

We next extend the fundamental theorem and thereby identify the primitives of 
all piece-wise continuous functions. 
Fig. 6.5 An infinite square wave and one of its primitives


Proposition 6.19 Let $f : [a,b] \to \mathbb{R}$ be piece-wise continuous. Let $F : [a,b] \to \mathbb{R}$ be continuous in $[a, b]$, differentiable except at a finite set of points (possibly empty), and satisfy $F'(x) = f(x)$ for each $x$ at which $F$ is differentiable. Then

\[ \int_a^b f = F(b) - F(a). \]

In other words the function $F$ is a primitive of $f$.


Proof The points $t_0 < t_1 < \dots < t_m$, at which $F$ is not differentiable, form a partition of $[a, b]$ and we may obviously include the endpoints, so that $t_0 = a$ and $t_m = b$. Moreover these points include all discontinuities of $f$ (by, for example, Sect. 5.8 Exercise 10). Now we have, by the fundamental theorem:

\[
\int_a^b f = \sum_{k=0}^{m-1} \int_{t_k}^{t_{k+1}} f = \sum_{k=0}^{m-1} \bigl(F(t_{k+1}) - F(t_k)\bigr) = F(b) - F(a).
\]
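The extended fundamental theorem can be tried on a square wave and a triangular primitive of the kind pictured in Fig. 6.5. The Python sketch below is illustrative only. The normalisation of the square wave (values $\pm 1$, jumps at the integers), the function names, and the midpoint-rule approximation of the integral are all assumptions of the example rather than details taken from the figure.

```python
import math

def square_wave(x):
    """+1 on ]2k, 2k+1[ and -1 on ]2k+1, 2k+2[; the values at integers are irrelevant."""
    return 1.0 if math.floor(x) % 2 == 0 else -1.0

def triangle_wave(x):
    """A continuous primitive: slope +1 where the square wave is +1, slope -1 elsewhere."""
    r = x - 2.0 * math.floor(x / 2.0)    # position within the period, 0 <= r < 2
    return r if r <= 1.0 else 2.0 - r

def midpoint_integral(f, a, b, n=20000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    width = (b - a) / n
    return sum(f(a + (k + 0.5) * width) for k in range(n)) * width

a, b = 0.5, 3.25                          # the interval contains the jumps at 1, 2 and 3
numeric = midpoint_integral(square_wave, a, b)
exact = triangle_wave(b) - triangle_wave(a)
```

Here `exact` equals $0.25$, and `numeric` agrees with it up to the quadrature error caused by the three jumps.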
a k=0 % ' k—0 


Several rules of integral calculus can be extended by using primitives instead of 
antiderivatives as we shall see. 


6.5.2 Exercises 


1. Let $f$ be a continuous function on an open interval $A$. We have seen that for every choice of $a$ in $A$, the function $F(x) = \int_a^x f$ is an antiderivative for $f$. It can happen that $f$ has antiderivatives that cannot be expressed in this form. Find an example of this.

2. Suppose that $f$ is bounded on $[a, b]$ and for each $\varepsilon > 0$, it is integrable on $[a + \varepsilon, b]$. Show that $f$ is integrable on $[a, b]$ and $\lim_{\varepsilon \to 0+} \int_{a+\varepsilon}^b f = \int_a^b f$.

3. Give an example to show that Proposition 6.19 does not hold if the requirement that $F$ is continuous is omitted.

4. Let $f(x) = x^2 \sin(x^{-2})$ for $x \ne 0$ and $f(0) = 0$. Show that $f$ is everywhere differentiable, but that $f'$ is unbounded in any interval containing 0. Thus the integral of $f'$ over such an interval does not exist (at least not as a Riemann–Darboux integral).

5. In this set of exercises we study periodic functions and their primitives. Recall (Sect. 4.2 Exercise 20) that a function $f$ is periodic if there exists $T \ne 0$ (a period) such that $f(x + T) = f(x)$ for all $x$. Let $f$ be a periodic function with fundamental period $T$ (that is, $T$ is the lowest positive period, supposing that one such exists). Suppose that $f$ is bounded and integrable on the interval $[0, T]$.

(a) Show that $f$ is integrable on any bounded interval.
(b) Show that for any interval $[a, b]$, such that $b - a = T$ we have

\[ \int_a^b f = \int_0^T f. \]


(c) Show that 


(d) The quantity $(1/T) \int_0^T f$ appearing in item (c) is called the mean of the periodic function $f$. Show that there exists a unique constant $C$ such that $f + C$ has mean 0.

(e) Let $F(x) = \int_0^x f$. Show that $F$ is periodic if and only if $f$ has mean zero.

(f) Suppose that $f$ has mean zero and set $F_0 = f$. Show that one may define uniquely a sequence of functions $(F_n)_{n=0}^{\infty}$, such that each function is periodic with period $T$, each has mean zero, $F_1$ is a primitive of $F_0$, and $F_n' = F_{n-1}$ for $n = 2, 3, \dots$ (so, in fact, $F_n$ is a primitive of $F_{n-1}$ for $n \ge 1$).


6.6 () Riemann Sums 


We have defined the integral of a bounded function f on an interval [a, b] as the 
supremum over all lower sums or the infimum over all upper sums, provided these 
two happen to be the same. This was not Riemann’s definition; actually it is due to 
Darboux. However, it turns out that Riemann’s integral and Darboux’s are the same; 
the same functions are integrable and the integral has the same value when it exists. 
This is reflected in our choice of name: the Riemann—Darboux integral. In this nugget 
we are going to consider how Riemann defined his integral, and show that the result 
is equivalent to that of Darboux. 
Riemann approached the integral through what we call Riemann sums. Given a bounded function $f$ on a closed interval $[a, b]$ the Riemann sums are defined in the following way. Let $P$ be a partition of the interval $[a, b]$, let us say it is the sequence $P = (t_0, t_1, \dots, t_m)$. For each $j = 0, 1, \dots, m - 1$ we choose quite arbitrarily a point $r_j$ in $[t_j, t_{j+1}]$. Now we form the sum

\[ S = \sum_{j=0}^{m-1} f(r_j)(t_{j+1} - t_j). \]


A sum formed in this way is a Riemann sum for the function f, corresponding to 
the partition P and the choice of points r;. In an obvious way it is a candidate for an 
approximation to the area under the graph y = f(x) (in the case when f is positive) 
that is just as plausible as an upper or lower sum, if not more so. 

The quantity $\max_{0 \le j \le m-1} (t_{j+1} - t_j)$ is called the mesh size of the partition $P$ (the analogy is with a fishing net that lets fish below a certain size escape; how appropriate this is remains moot). Riemann defined the integral as a kind of limit, when it exists. More precisely it is a number $A$ that has the following property: for each $\varepsilon > 0$ there exists $\delta > 0$, such that

\[ \left| A - \sum_{j=0}^{m-1} f(r_j)(t_{j+1} - t_j) \right| < \varepsilon \]

for all partitions $P = (t_0, t_1, \dots, t_m)$ with mesh size less than $\delta$, and for all possible choices of the points $r_j$ in the intervals $[t_j, t_{j+1}]$.
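Riemann's definition lends itself to experiment: whatever points $r_j$ are chosen, the sums should settle near the integral once the mesh is small. The Python sketch below is only an illustration. The function names are hypothetical, the points $r_j$ are drawn at random with a fixed seed, and $f(x) = x^2$ on $[0, 1]$ (with integral $1/3$) is an assumed example.

```python
import random

def riemann_sum(f, partition, tags):
    """Sum of f(r_j) * (t_{j+1} - t_j) for the chosen points r_j = tags[j]."""
    return sum(f(r) * (partition[j + 1] - partition[j]) for j, r in enumerate(tags))

def random_tagged_sum(f, a, b, m, rng):
    """Riemann sum on the uniform partition into m pieces with random points r_j."""
    partition = [a + (b - a) * j / m for j in range(m + 1)]
    tags = [rng.uniform(partition[j], partition[j + 1]) for j in range(m)]
    return riemann_sum(f, partition, tags)

rng = random.Random(0)            # fixed seed so that the run is reproducible
sums = [random_tagged_sum(lambda x: x * x, 0.0, 1.0, 2000, rng) for _ in range(5)]
```

Since $x^2$ is increasing, each of these sums lies between the lower and upper sums of the partition, which differ by $1/2000$, so all five values are within that distance of $1/3$.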


Proposition 6.20 The Riemann integral of a bounded function on an interval $[a, b]$
exists if and only if its Darboux integral exists. When the integrals exist, they are 
equal. 


Proof The proof of this is long and extends to the end of this subsection. We take 
the shorter part first. 

We assume that the integral of f on [a, b] exists according to Riemann’s definition, 
and that the integral has the value A. We wish to show that the Darboux integral exists 
and also has the value A. 

Let e¢ > 0. Choose 5 > 0, such that 


1 


ja- fT PCja1 — tj) | <eé 


j=0 


for all partitions P = (fo, t, ..., 47) with mesh size less than 6, and for all possible 
choices of the points r; in [f;, tj+1]. 
Consider one such partition P and let 
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m;= inf f and M;= sup f. 


[tj tj] [tj ,tj41] 


For each subinterval we can choose a point $r_j$, such that $f(r_j) < m_j + \varepsilon$. Then we
find

\[ \sum_{j=0}^{m-1} f(r_j)\,(t_{j+1} - t_j) < \sum_{j=0}^{m-1} (m_j + \varepsilon)(t_{j+1} - t_j) = L(f, P) + \varepsilon(b - a) \]


so that

\[ A < L(f, P) + \varepsilon(b - a + 1). \]

In a similar way we can obtain

\[ A > U(f, P) - \varepsilon(b - a + 1). \]

But from this it follows that

\[ A - \varepsilon(b - a + 1) < L(f, P) \le U(f, P) < A + \varepsilon(b - a + 1). \]

This holds for all $\varepsilon > 0$, so that we conclude that f is integrable in the sense of
Darboux and the integral equals A. This concludes the first, and shorter, part of the
proof.

The proof that if f is Darboux-integrable, it is also Riemann-integrable, is more 
complicated. We begin with some general considerations before turning to the actual 
proof. 

Let K be a constant, such that |f(x)| < K for all x in [a, b] (recall that we are
assuming that f is bounded). Suppose $\delta > 0$ and consider a partition P with mesh size
less than $\delta$. We wish to give an upper estimate for how much the lower sum L(f, P)
and the upper sum U(f, P) change, if P is replaced by a new partition P', which is
formed by adding p new points to P.

The new points land in at most p subintervals of P. First we will estimate the 
contribution of these intervals to the lower and upper sums, before the inclusion of 
the new points. 

The absolute value of the contribution is less than $pK\delta$, for there are at most p
intervals in question, each has length at most $\delta$, and $m_j$ and $M_j$ have absolute value
less than K.

Next we estimate the contribution to the lower and upper sums of these same
intervals after the insertion of the new points. The intervals in question get replaced
by at most 2p new intervals (how many they are will depend on how the new points
fall), and their contribution to L(f, P') and U(f, P') has absolute value at most
$2pK\delta$.

The conclusion therefore is that

\[ |L(f, P) - L(f, P')| < 3pK\delta, \qquad |U(f, P) - U(f, P')| < 3pK\delta. \]


Now we can prove that a Darboux-integrable function is Riemann-integrable.
Suppose that f is Darboux-integrable and that |f| < K. Let $\varepsilon > 0$. Choose a partition
$P_1$, such that

\[ U(f, P_1) - \frac{\varepsilon}{2} < \int f < L(f, P_1) + \frac{\varepsilon}{2}. \]

Suppose that $P_1$ has p points. Let $\delta = \varepsilon/(6pK)$. Observe that $\delta$ depends only on $\varepsilon$ and
f (the somewhat arbitrary choice of p depended only on $\varepsilon$ and f).

Let P be a partition with mesh size less than $\delta$. Let P' be the partition that is
formed by uniting P and $P_1$. Then P' is finer than $P_1$ so that

\[ U(f, P') - \frac{\varepsilon}{2} < \int f < L(f, P') + \frac{\varepsilon}{2}. \]


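But P' is obtained by adding at most p points to P, and the latter has mesh size less than $\delta$,
so that we find

\[ |L(f, P) - L(f, P')| < 3pK\delta, \qquad |U(f, P) - U(f, P')| < 3pK\delta, \]

and combining this with the previous inequalities we obtain

\[ U(f, P) - 3pK\delta - \frac{\varepsilon}{2} < \int f < L(f, P) + 3pK\delta + \frac{\varepsilon}{2} \]

or, recalling the definition of $\delta$:

\[ U(f, P) - \varepsilon < \int f < L(f, P) + \varepsilon. \]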


To summarise, this holds on the sole premise that the mesh size of P is less than
$\delta$. Furthermore a Riemann sum $\sum_{j=0}^{m-1} f(r_j)(t_{j+1} - t_j)$ for the partition P, formed
by taking points $r_j$ in the intervals $[t_j, t_{j+1}]$, lies between L(f, P) and U(f, P). We
conclude that

\[ \Bigl| \sum_{j=0}^{m-1} f(r_j)\,(t_{j+1} - t_j) - \int f \Bigr| < \varepsilon \]

whenever the mesh size of P is less than $\delta$; that is, the Darboux-integral $\int f$ is also
the Riemann-integral.

6.6.1 Things You Can Do with Riemann Sums


Riemann sums are highly versatile tools with both practical and theoretical uses. We 
give one example of each. 


Approximate an Integral 


A Riemann sum is a simple approximation to the corresponding integral. Better 
approximation methods have been developed but they are largely refinements of the 
Riemann sum. 

We can estimate the error if we know a little more about the function f. Suppose
that f is differentiable and that |f'(x)| < M for all x in [a, b]. Then, by the mean
value theorem, we have

\[ |f(x) - f(r_j)| \le M|x - r_j| \le M(t_{j+1} - t_j) \]

for all x in the interval $[t_j, t_{j+1}]$. Hence

\[ \Bigl| \int_a^b f - \sum_{j=0}^{m-1} f(r_j)\,(t_{j+1} - t_j) \Bigr| = \Bigl| \sum_{j=0}^{m-1} \int_{t_j}^{t_{j+1}} \bigl(f - f(r_j)\bigr) \Bigr| \le \sum_{j=0}^{m-1} M (t_{j+1} - t_j)^2. \]

To see what this means, let us suppose the partition P divides the interval [a, b]
into m equal intervals. Then the above error bound is

\[ \frac{M(b - a)^2}{m}. \]

A natural way to implement such an approximation is to double the number of
partition points in each step. After n steps the error is bounded by

\[ \frac{M(b - a)^2}{2^n}. \]

This means that, at worst, each doubling contributes roughly 0.3 (approximately
$\log_{10} 2$) further correct decimal places. Plainly room for improvement!
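
The doubling scheme and the error bound can be tried out numerically. The following is only a minimal sketch under assumptions that are not in the text: the test function sin on [0.5, 1.5] (so that |f'| < M = 1), left endpoints as the points $r_j$, and the helper name `riemann_sum` are all illustrative choices.

```python
import math

def riemann_sum(f, a, b, m):
    """Riemann sum of f over [a, b] with m equal subintervals,
    taking the left endpoint of each subinterval as the point r_j."""
    h = (b - a) / m
    return sum(f(a + j * h) for j in range(m)) * h

# Illustrative choice (an assumption): f = sin on [0.5, 1.5], where |f'| = |cos| < 1.
a, b, M = 0.5, 1.5, 1.0
exact = math.cos(a) - math.cos(b)   # integral of sin over [a, b]

errors = []
for n in range(1, 11):
    m = 2 ** n                       # double the number of subintervals each step
    err = abs(exact - riemann_sum(math.sin, a, b, m))
    bound = M * (b - a) ** 2 / m     # the bound M(b - a)^2 / m from the text
    errors.append((m, err, bound))
    print(f"m = {m:5d}   error = {err:.3e}   bound = {bound:.3e}")
```

In this example the observed error stays below the bound and roughly halves with each doubling, which is the slow gain of about 0.3 decimal places per step described above.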


Prove a Refinement of the Fundamental Theorem 


Proposition 6.21 Let f : [a, b] → R be continuous in [a, b], differentiable in ]a, b[,
and suppose that f' is bounded in ]a, b[ and integrable on [a, b]. Then

\[ \int_a^b f' = f(b) - f(a). \]


Proof For each partition $a = t_0 < t_1 < \cdots < t_m = b$ we have, by the mean value
theorem,

\[ f(b) - f(a) = \sum_{k=0}^{m-1} \bigl(f(t_{k+1}) - f(t_k)\bigr) = \sum_{k=0}^{m-1} f'(s_k)\,(t_{k+1} - t_k) \]

for certain numbers $s_k$ in $[t_k, t_{k+1}]$, $k = 0, 1, \dots, m - 1$. On the right we have a Riemann
sum for f', and it tends to $\int f'$ as the mesh size tends to 0. We conclude that the
integral is equal to f(b) − f(a).


The proposition says that given an integrable function g, a function f, continuous
in [a, b] and satisfying f'(x) = g(x) at every point of ]a, b[, is a primitive (in the
sense of Sect. 6.5) of g. A nice result, but not nearly as practical as it might appear, as
the requirement that f'(x) = g(x) at every point of the open interval means that g,
if discontinuous, cannot have jump discontinuities. In this connection see Sect. 5.8
Exercise 10.


6.6.2 Exercises 


1. Compute the following limits by interpreting them as Riemann sums: 
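   (a) [formula illegible in the source]

   (b) [formula illegible in the source]

   (c) $\lim_{n\to\infty} \frac{1}{n^{p+1}} \sum_{k=1}^{n} k^p$, where p > 0.

   (d) $\lim_{n\to\infty} \sum \frac{[\,\ldots\,]}{n^2 + k^2}$ (numerator and summation limits illegible in the source)

   (e) $\lim_{n\to\infty} \sum_{k=1}^{n} \frac{1}{\sqrt{n^2 + k^2}}$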


6.6.3 Pointers to Further Study 


— Lebesgue integral 
— Numerical integration 

6.7 () The Arc Length, Volume and Surface of Revolution Integrals


Much of the motivation for the definition of the integral was derived from computing
the area of a particular kind of plane figure, and a fairly simple one at that, bounded
on three sides by segments of the straight lines, y = 0, x = a and x = b, the fourth
side being a part of the graph y = f(x). One can easily extend this to the area
between two graphs y = f(x) and y = g(x) between x = a and x = b, assuming
that f(x) < g(x) in the interval ]a, b[. The area is then $\int_a^b (g - f)$. To go beyond
this in the calculation of area, we need to give a general definition of area of a plane
figure. This is not so simple and requires a study of the topological properties of the
plane.

The notion of arc length is by nature simpler than that of area, being essentially 
one-dimensional. We can give a treatment of the length of a reasonably well-behaved 
curve using approximations analogous to Riemann sums. 

First let us look at a graph. Given a function f with domain [a, b] we obtain a
“curve”, the graph y = f(x). We can approximate what should turn out to be its
length by taking a partition $P = (t_0, t_1, \dots, t_m)$ of [a, b] and writing down the sum

\[ S(P) = \sum_{k=0}^{m-1} \sqrt{(t_{k+1} - t_k)^2 + \bigl(f(t_{k+1}) - f(t_k)\bigr)^2}. \]

Geometrically S(P) is the length of a polygonal curve inscribed in the curve y =
f(x). Now we can define the arc length in imitation of Riemann’s definition of the
integral, as the number L that has the following property, if such a number should
exist: for all $\varepsilon > 0$ there exists $\delta > 0$, such that for all partitions P of [a, b] with
mesh-size less than $\delta$ we have

\[ |L - S(P)| < \varepsilon. \]


This construction is illustrated in Fig. 6.6. 
The most important case for applications is when the curve has a continuously 
varying tangent. Then we obtain the arc length integral. 


Fig. 6.6 Approximating arc 
length 
Proposition 6.22 Let f, with domain [a, b], be continuous in [a, b] and differentiable
in ]a, b[, and assume that f' extends to a continuous function in [a, b]. Then
the length of the curve y = f(x) is given by the arc length integral

\[ L = \int_a^b \sqrt{1 + f'(x)^2}\,dx. \]


Proof For a given partition P, we can use the mean value theorem to find numbers
$r_k$ in $[t_k, t_{k+1}]$ (for each k) so that we can write

\[ S(P) = \sum_{k=0}^{m-1} \sqrt{(t_{k+1} - t_k)^2 + \bigl(f(t_{k+1}) - f(t_k)\bigr)^2} = \sum_{k=0}^{m-1} \sqrt{1 + f'(r_k)^2}\,(t_{k+1} - t_k). \]


We have here a Riemann sum for the arc length integral so the result follows once 
we show that the integrand is integrable (see the nugget on Riemann sums). The 
integrability was covered in Sect. 6.4 Exercise 4. 
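
The convergence of the polygonal sums S(P) to the arc length integral can be observed numerically. The sketch below is illustrative only and rests on assumptions not in the text: the curve y = cosh x on [0, 1] (whose arc length integral evaluates to sinh 1), equally spaced partitions, and the helper name `polygon_length`.

```python
import math

def polygon_length(f, a, b, m):
    """S(P): length of the polygon inscribed in y = f(x),
    for the partition of [a, b] into m equal subintervals."""
    ts = [a + (b - a) * k / m for k in range(m + 1)]
    return sum(
        math.hypot(ts[k + 1] - ts[k], f(ts[k + 1]) - f(ts[k]))
        for k in range(m)
    )

# Illustrative curve (an assumption): y = cosh x on [0, 1].
# Here sqrt(1 + sinh(x)^2) = cosh x, so the arc length integral equals sinh 1.
exact_length = math.sinh(1.0)
sizes = (1, 10, 100, 1000)
lengths = [polygon_length(math.cosh, 0.0, 1.0, m) for m in sizes]

for m, s in zip(sizes, lengths):
    print(f"m = {m:5d}   S(P) = {s:.10f}   L - S(P) = {exact_length - s:.3e}")
```

Each inscribed polygon is shorter than the curve, and the gap shrinks as the mesh size decreases.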


The formula extends easily to the case of a curve that may have corners; more
precisely to the case when f is continuous but is only piece-wise continuously
differentiable. By the latter we mean that there is a partition $(s_0, s_1, \dots, s_m)$ of [a, b],
such that the derivative exists except at the partition points $s_k$, and for each open
interval $]s_k, s_{k+1}[$ the derivative extends continuously to the closed interval $[s_k, s_{k+1}]$.
This includes the case of polygons and most curves that arise in practical applications.


6.7.1 Length of Parametric Curves 


A plane parametric curve, expressed by x = f(t), y = g(t), where a < t < b, can 
cross itself, double back along itself, or worse. As a geometric object it is tricky to 
define its length. However we can compute the distance travelled as t goes from a to 
b. This has obvious practical applications. 

For each partition $P = (t_0, t_1, \dots, t_m)$ of [a, b], we can approximate the distance
travelled by

\[ d(P) = \sum_{k=0}^{m-1} \sqrt{\bigl(f(t_{k+1}) - f(t_k)\bigr)^2 + \bigl(g(t_{k+1}) - g(t_k)\bigr)^2}. \]

Then we can say that the distance D is a number that has the following property, if
such a number exists: for all $\varepsilon > 0$ there exists $\delta > 0$, such that for all partitions P
with mesh-size less than $\delta$ we have

\[ |D - d(P)| < \varepsilon. \]

How this leads to an integral is explored in the next exercises. The construction
is illustrated in Fig. 6.7.

Fig. 6.7 Approximating the length of a parametric curve
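
To see that d(P) measures distance travelled rather than the length of the geometric object, one can try a curve that runs over itself. The following sketch assumes, purely for illustration, the unit circle traversed twice (x = cos t, y = sin t for 0 ≤ t ≤ 4π) and uses Python's built-in circular functions; the helper name `distance_travelled` is hypothetical.

```python
import math

def distance_travelled(f, g, a, b, m):
    """d(P) for the partition of [a, b] into m equal subintervals."""
    ts = [a + (b - a) * k / m for k in range(m + 1)]
    return sum(
        math.hypot(f(ts[k + 1]) - f(ts[k]), g(ts[k + 1]) - g(ts[k]))
        for k in range(m)
    )

# Illustrative curve (an assumption): the unit circle traversed twice.
# The speed is 1 throughout, so the distance travelled should be 4*pi,
# twice the perimeter of the circle.
d = distance_travelled(math.cos, math.sin, 0.0, 4 * math.pi, 10_000)
print(d, 4 * math.pi)
```

The computed value is close to 4π, not 2π: the approximation counts every pass along the circle.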


6.7.2 Exercises 


1. The upper unit semicircle is the curve $y = \sqrt{1 - x^2}$ for $-1 \le x \le 1$. Write the
integral for the arc length from x = 0 to x = c, where 0 < c < 1.
Note. The arc length is of course arcsin c. The integral obtained offers a geometrically appealing 
way to define the circular functions rigorously and will be used for this purpose in the next 
chapter. 

2. The upper arc of the ellipse $x^2/a^2 + y^2/b^2 = 1$, with semi-major axis a and
semi-minor axis b is the curve

\[ y = \frac{b}{a}\sqrt{a^2 - x^2}, \qquad -a \le x \le a. \]

It is often convenient to express properties of the ellipse in terms of a, and the
eccentricity e, defined as

\[ e = \sqrt{1 - \frac{b^2}{a^2}}. \]

Given c such that 0 < c < a, express the integral for the arc length of the ellipse,
from x = 0 to x = c, in terms of a, e and c.

Note. The apparent difficulty of computing this integral (given that $a \ne b$) culminated in the
theory of elliptic functions early in the nineteenth century.

3. Show that the length of the curve y = f(x) between x = a and x = b, when it 
exists, is actually the supremum of S(P) taken over all partitions P. Show the 
same for the distance travelled along a parametric curve, that D is the supremum 
of d(P) taken over all partitions. 

Note. The arc length is often defined as the supremum of the lengths of inscribed polygons. 
One might suppose that one could define the area of a surface as the supremum of the areas of 
inscribed polyhedra. However, this does not work, as it leads to infinities. 
4. Suppose that the parametric curve x = f(t), y = g(t), a < t < b has the properties
that f and g have continuous derivatives in ]a, b[ that extend to continuous
functions in [a, b]. Show that

\[ D = \int_a^b \sqrt{f'(t)^2 + g'(t)^2}\,dt. \]

This exceedingly natural formula shows that the distance travelled is the integral
of the speed, as we always knew it was.
Hint. One can write

\[ d(P) = \sum_{k=0}^{m-1} \sqrt{f'(\alpha_k)^2 + g'(\beta_k)^2}\,(t_{k+1} - t_k) \]

where, for each k, the numbers $\alpha_k$ and $\beta_k$ are in the interval $]t_k, t_{k+1}[$. As it stands
this is not a Riemann sum. One has to move the point $\beta_k$ to $\alpha_k$, thus obtaining a
Riemann sum, and estimate the total change. One can use the uniform continuity
of the function $(g')^2$ on the interval [a, b], and that of the function $\sqrt{x}$ on the
interval [0, M], where M is an upper bound for $f'^2 + g'^2$ on the interval [a, b].


6.7.3 Volumes and Surfaces of Revolution 


Calculus books intended for users of mathematics introduce methods for calculating 
the volume of a body of revolution and its surface area. Although these concepts 
belong properly to multivariable calculus, they reduce to integrals of functions of 
one variable. They can be motivated by the same type of approximation as we used 
to motivate the integral $\int f$ as the area under the graph y = f(x). The full details
will not be given here but the reader who has tackled Exercise 4 should be able to 
supply them. 


Volume of Revolution Integral 


Let f be a positive function with the domain [a, b]. The plane region bounded by
the graph y = f(x), the x-axis, and the lines x = a and x = b is rotated in three
dimensions about the x-axis to form a solid. One can introduce a third coordinate
axis, the z-axis, at right angles to the (x, y)-plane to effect the rotation analytically.
Let $P = (t_0, t_1, \dots, t_m)$ be a partition of [a, b]. We can approximate the volume from
below by the total volume of a collection of cylinders, thus,

\[ V_{\mathrm{lower}}(P) = \sum_{k=0}^{m-1} \pi m_k^2\,(t_{k+1} - t_k) \]

and from above thus,

\[ V_{\mathrm{upper}}(P) = \sum_{k=0}^{m-1} \pi M_k^2\,(t_{k+1} - t_k) \]

where, we recall, $m_k = \inf_{[t_k, t_{k+1}]} f$ and $M_k = \sup_{[t_k, t_{k+1}]} f$. We are proceeding
intuitively here; a definition of $\pi$ building on analysis alone will be given in the next
chapter.

Refinement of the partition leads to the volume of revolution integral

\[ V = \pi \int_a^b f(x)^2\,dx. \]
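
The squeeze between the lower and upper cylinder sums can be illustrated numerically. This is a sketch under assumptions not made in the text: f is taken to be increasing (so that the infimum and supremum on each subinterval are attained at its endpoints), the example f(x) = 1 + x on [0, 1] is chosen because its volume integral is π · 7/3, and the helper name `cylinder_sums` is hypothetical.

```python
import math

def cylinder_sums(f, a, b, m):
    """Lower and upper cylinder sums for an increasing positive f,
    using the partition of [a, b] into m equal subintervals."""
    ts = [a + (b - a) * k / m for k in range(m + 1)]
    # f increasing  =>  m_k = f(t_k) and M_k = f(t_{k+1})
    lower = sum(math.pi * f(ts[k]) ** 2 * (ts[k + 1] - ts[k]) for k in range(m))
    upper = sum(math.pi * f(ts[k + 1]) ** 2 * (ts[k + 1] - ts[k]) for k in range(m))
    return lower, upper

# Illustrative profile (an assumption): f(x) = 1 + x on [0, 1].
# The integral of (1 + x)^2 over [0, 1] is 7/3, so V = 7*pi/3.
f = lambda x: 1.0 + x
exact_volume = math.pi * 7.0 / 3.0
lower, upper = cylinder_sums(f, 0.0, 1.0, 1000)
print(lower, exact_volume, upper)
```

The two sums bracket the value of the integral, and their difference shrinks in proportion to the mesh size.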


Surface of Revolution Integral 


A similar process leads to an integral for the surface of revolution. Using a partition
P we inscribe a polygon in the curve y = f(x), the same as we used to obtain arc
length. Rotate this polygon about the x-axis. The bit between $x = t_k$ and $x = t_{k+1}$
turns into the frustum of a cone with surface area

\[ \pi\bigl(f(t_k) + f(t_{k+1})\bigr)\sqrt{(t_{k+1} - t_k)^2 + \bigl(f(t_{k+1}) - f(t_k)\bigr)^2}. \]

The sum of these leads to the surface of revolution integral

\[ A = 2\pi \int_a^b f(x)\sqrt{1 + f'(x)^2}\,dx. \]
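
The frustum sums can also be tested on an example. The sketch below assumes, for illustration only, the unit sphere profile $f(x) = \sqrt{1 - x^2}$ on [−0.5, 0.5]; for this profile the integrand $f(x)\sqrt{1 + f'(x)^2}$ is identically 1, so the surface integral equals 2π times the length of the interval. The helper name `frustum_sum` is hypothetical.

```python
import math

def frustum_sum(f, a, b, m):
    """Total lateral area of the frusta obtained by rotating the polygon
    inscribed in y = f(x) (m equal subintervals) about the x-axis."""
    ts = [a + (b - a) * k / m for k in range(m + 1)]
    return sum(
        math.pi * (f(ts[k]) + f(ts[k + 1]))
        * math.hypot(ts[k + 1] - ts[k], f(ts[k + 1]) - f(ts[k]))
        for k in range(m)
    )

# Illustrative profile (an assumption): a zone of the unit sphere.
f = lambda x: math.sqrt(1.0 - x * x)
a, b = -0.5, 0.5
area = frustum_sum(f, a, b, 2000)
expected = 2 * math.pi * (b - a)   # value of the surface of revolution integral here
print(area, expected)
```

That the area of the zone depends only on its height b − a is the content of the theorem of Archimedes in Exercise 6 below.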


6.7.4 Exercises (cont’d) 


5. Prove a theorem of Archimedes: the area of a parabolic segment cut off by a chord 
is 4/3 times the area of the triangle whose base is the chord and whose height 
equals the height of the segment measured from the chord. 

Hint. It can help to set up coordinates in a convenient way. The following is
only a suggestion. Take the origin at the midpoint of the chord and the y-axis
parallel to the axis of the parabola. Let the equation of the chord be y = mx and
its endpoints (l, ml), (−l, −ml). Show that the parabola is one of a one-parameter
family

\[ y = c(l^2 - x^2) + mx \]

where we may assume that the parameter c > 0 (this just means that the parabola
opens downwards). Now compute the area A of the segment by integration, and
the maximum area of a triangle with vertices at (l, ml), (−l, −ml) and the third
vertex on the arc joining (l, ml) and (−l, −ml).

6. Prove a theorem of Archimedes: when a sphere is inscribed in a cylinder (which 
then has the same radius as the sphere) and both are cut by two parallel planes at 
right angles to the cylinder, the two planes cut equal areas off the sphere and the
cylinder. 


6.7.5 Pointers to Further Study 


— Multivariable calculus 
— Differential geometry 


6.8 () Approximation by Step Functions 


The small oscillation theorem (Proposition 4.14) admits another interpretation. It
says that every continuous function on the interval [a, b] can be approximated by a
step function in a rather precise sense. Let f be continuous on [a, b]. Then for all
$\varepsilon > 0$ there exists a step function h with the same domain, such that for all x in [a, b]
we have

\[ |f(x) - h(x)| < \varepsilon. \]

All we need to do to construct h is form a partition $t_0 < t_1 < \cdots < t_m$ of [a, b], such
that the oscillation of f on each subinterval is less than $\varepsilon$. Then we define h to be
constant on each subinterval, with value equal to f at its midpoint, for example.

The graphs of the two functions, f and h, remain close, with error less than $\varepsilon$
throughout the whole interval [a, b]. This is called a uniform approximation of f by
a step function.
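
The midpoint construction is easy to try out. The following is a minimal sketch, assuming for illustration the function f(x) = x² on [0, 1], the tolerance ε = 0.01, and equal subintervals; the helper name `step_approximation` is hypothetical. The uniform error is only sampled on a grid, which illustrates the claim but does not prove it.

```python
def step_approximation(f, a, b, n):
    """Return a step function h that is constant on each of n equal
    subintervals of [a, b], with value f(midpoint of the subinterval)."""
    width = (b - a) / n

    def h(x):
        k = min(int((x - a) / width), n - 1)   # index of the subinterval containing x
        return f(a + (k + 0.5) * width)

    return h

# Illustrative choice (an assumption): f(x) = x^2 on [0, 1], eps = 0.01.
# On a subinterval of length 1/250 the oscillation of x^2 is at most 2/250 = 0.008 < eps.
f = lambda x: x * x
eps = 0.01
h = step_approximation(f, 0.0, 1.0, 250)

# Sample the error |f(x) - h(x)| on a fine grid.
worst = max(abs(f(i / 10_000) - h(i / 10_000)) for i in range(10_001))
print(worst, eps)
```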

Certain functions other than continuous ones can be uniformly approximated
by step functions to arbitrary accuracy, monotonic functions for example. Suppose
that f is increasing on [a, b]. Let $\varepsilon > 0$. Partition the interval [f(a), f(b)]
into subintervals of length less than $\varepsilon$. For example we can let $y_0 = f(a)$, $y_m = f(b)$,
choosing m so that $(f(b) - f(a))/m < \varepsilon$. Then we construct the partition
$y_0 < y_1 < y_2 < \cdots < y_m$ with subintervals of equal length. Next we let $A_0$ be the
set of points in [a, b], such that $y_0 \le f(x) \le y_1$ and then, for $k = 1, 2, \dots, m - 1$,
we set $A_k$ equal to the set of points in [a, b] such that $y_k < f(x) \le y_{k+1}$.


Exercise Show that each set $A_k$ is an interval. It may be empty; if it is not empty,
it may contain neither of its endpoints, one of its endpoints, or both of them. Draw
some pictures illustrating each of these possibilities. Show also that if j < k, then
for all s in $A_j$ and t in $A_k$ we have s < t. Show finally that the union of the sets $A_k$
is all of [a, b].


To construct a step function h that approximates f uniformly with error less than
$\varepsilon$, we define h(x), for x in $A_k$, to be equal to $\frac{1}{2}(y_k + y_{k+1})$, except at endpoints of
$A_k$ (should either of them be in $A_k$). At every endpoint we let h be equal to f.

This construction leads to a step function h, such that for all x in [a, b] we
have $|f(x) - h(x)| < \varepsilon$. We even have h(a) = f(a) and h(b) = f(b), and h is also
increasing, which all turns out to be quite useful as we shall see. We can use it to
prove the second mean value theorem for integrals, a result that one can sometimes
turn to when all else seems to fail.


Proposition 6.23 Let g be integrable on the interval [a, b] and let f be monotonic
on the same interval. Then there exists $\xi$ in [a, b], such that

\[ \int_a^b fg = f(a)\int_a^{\xi} g + f(b)\int_{\xi}^b g. \]


Proof The proof is lengthy, so be prepared. We may suppose that f is increasing.
The proof is in two main steps.

Step 1. We prove the result in the case that f is a step function. Let $a = t_0 < t_1 <
\cdots < t_m = b$ be a partition of [a, b] and suppose that

\[ f(x) = c_k, \qquad t_k < x < t_{k+1}, \qquad k = 0, 1, \dots, m - 1. \]

The values of f at the endpoints can be quite arbitrary, but we require that f is
increasing, which implies that the sequence $c_k$ is increasing.

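Let $G(x) = \int_a^x g$ for each x in [a, b]. Then G is continuous and satisfies G(a) = 0.
Now we have

\[ \int_a^b fg = \sum_{k=0}^{m-1} c_k \int_{t_k}^{t_{k+1}} g = \sum_{k=0}^{m-1} c_k\bigl(G(t_{k+1}) - G(t_k)\bigr) \]
\[ = \sum_{k=0}^{m-1} \bigl(c_{k+1} G(t_{k+1}) - c_k G(t_k)\bigr) + \sum_{k=0}^{m-1} (c_k - c_{k+1})\,G(t_{k+1}) \]
\[ = c_m G(t_m) + \sum_{k=0}^{m-1} (c_k - c_{k+1})\,G(t_{k+1}). \]

Here $c_m$ may be taken to be any number satisfying $c_{m-1} \le c_m \le f(b)$; the identity holds
whatever value is assigned to it.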
k=0 


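Recalling that $t_m = b$ and G(a) = 0 we obtain

\[ f(b)G(b) - \int_a^b fg = \sum_{k=0}^{m-1} (c_{k+1} - c_k)\,G(t_{k+1}) + \bigl(f(b) - c_m\bigr)G(b), \]

so that

\[ \frac{f(b)G(b) - \int_a^b fg}{f(b) - f(a)} = \frac{\bigl(c_0 - f(a)\bigr)G(a) + \sum_{k=0}^{m-1} (c_{k+1} - c_k)\,G(t_{k+1}) + \bigl(f(b) - c_m\bigr)G(b)}{f(b) - f(a)}. \]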
On the right-hand side the coefficients in the numerator that multiply values of G,
that is, the factors $c_{k+1} - c_k$, together with $c_0 - f(a)$ and $f(b) - c_m$, are all non-negative,
and they add up to f(b) − f(a). So the right-hand side is a weighted
average of the values $G(a), G(t_1), G(t_2), \dots, G(b)$, and it must therefore lie between
the maximum and minimum of G(x) on the interval [a, b]. Hence, by the intermediate
value theorem (and recall here that G is continuous), there exists $\xi$ in [a, b], such
that $G(\xi)$ is equal to this weighted average (compare Sect. 4.3 Exercise 6). It follows
that

\[ \frac{f(b)G(b) - \int_a^b fg}{f(b) - f(a)} = G(\xi) \]

or, equivalently

\[ \int_a^b fg = f(b)G(b) - \bigl(f(b) - f(a)\bigr)G(\xi) = f(a)\int_a^{\xi} g + f(b)\int_{\xi}^b g. \]


This completes the proof in the case that f is a step function.

Step 2. To tackle the general case, we let f be an increasing function and approximate
it uniformly by a step function. We also introduce a constant K such that |g(x)| < K
for all x in [a, b].

Now let $\varepsilon > 0$. There exists an increasing step function h, such that for all x in
[a, b] we have

\[ |f(x) - h(x)| < \varepsilon, \]

and, moreover, h(a) = f(a), h(b) = f(b), a pair of equalities that should be borne
in mind while elucidating the remainder of the proof.

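By the case of a step function (we refer to the penultimate equation in step 1),
there exists $\xi$ in [a, b], such that

\[ \frac{h(b)G(b) - \int_a^b hg}{h(b) - h(a)} = G(\xi), \]

and since

\[ \Bigl| \int_a^b fg - \int_a^b hg \Bigr| \le \int_a^b |f - h|\,|g| \le K(b - a)\varepsilon \]

we have that

\[ \Bigl| \frac{f(b)G(b) - \int_a^b fg}{f(b) - f(a)} - G(\xi) \Bigr| = \Bigl| \frac{f(b)G(b) - \int_a^b fg}{f(b) - f(a)} - \frac{h(b)G(b) - \int_a^b hg}{h(b) - h(a)} \Bigr| = \Bigl| \frac{\int_a^b (f - h)g}{f(b) - f(a)} \Bigr| \le \frac{K(b - a)\varepsilon}{f(b) - f(a)} \]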
and rearranging this we obtain

\[ G(\xi) - \frac{K(b - a)\varepsilon}{f(b) - f(a)} \le \frac{f(b)G(b) - \int_a^b fg}{f(b) - f(a)} \le G(\xi) + \frac{K(b - a)\varepsilon}{f(b) - f(a)}. \]

Therefore

\[ \min_{[a,b]} G - \frac{K(b - a)\varepsilon}{f(b) - f(a)} \le \frac{f(b)G(b) - \int_a^b fg}{f(b) - f(a)} \le \max_{[a,b]} G + \frac{K(b - a)\varepsilon}{f(b) - f(a)}. \]

These inequalities hold for all $\varepsilon$. So we conclude that

\[ \min_{[a,b]} G \le \frac{f(b)G(b) - \int_a^b fg}{f(b) - f(a)} \le \max_{[a,b]} G. \]

Hence, by the intermediate value theorem, there exists $\eta$ in [a, b], such that

\[ \frac{f(b)G(b) - \int_a^b fg}{f(b) - f(a)} = G(\eta) \]

which leads to the required conclusion.
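
The statement of the second mean value theorem can be checked on a concrete example. The sketch below is illustrative only: it assumes f(x) = x² and g(x) = cos x on [0, 2], uses the closed form G(x) = sin x − sin a for the integral of g, approximates the integral of fg by a midpoint Riemann sum, and locates a suitable point by bisection. The names `midpoint_integral`, `target` and `xi` are hypothetical.

```python
import math

def midpoint_integral(func, lo, hi, m=20_000):
    """Midpoint Riemann sum; accurate enough for this illustration."""
    h = (hi - lo) / m
    return sum(func(lo + (j + 0.5) * h) for j in range(m)) * h

# Illustrative data (assumptions): f increasing, g integrable on [a, b].
a, b = 0.0, 2.0
f = lambda x: x * x
g = math.cos
G = lambda x: math.sin(x) - math.sin(a)   # G(x) = integral of g from a to x

integral_fg = midpoint_integral(lambda x: f(x) * g(x), a, b)

# The proof shows that this quotient equals G(xi) for some xi in [a, b].
target = (f(b) * G(b) - integral_fg) / (f(b) - f(a))

# In this particular example G - target changes sign between a and b,
# so plain bisection finds such a xi; in general one would search
# between points where G attains its minimum and maximum.
lo, hi = a, b
assert (G(lo) - target) * (G(hi) - target) <= 0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if (G(lo) - target) * (G(mid) - target) <= 0:
        hi = mid
    else:
        lo = mid
xi = 0.5 * (lo + hi)

# Right-hand side of the theorem: f(a) * int_a^xi g + f(b) * int_xi^b g
rhs = f(a) * (G(xi) - G(a)) + f(b) * (G(b) - G(xi))
print(xi, integral_fg, rhs)
```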


A much simpler proof of the second mean value theorem for integrals can be given 
using integration by parts, but using the stronger assumptions that g is continuous, 
f differentiable, f’ is continuous and positive (see Sect. 8.2). However, the theorem 
is often useful when g has discontinuities, as we shall see in the chapter on improper 
integrals. 

A corollary, the proof of which is left as an exercise, is called Bonnet’s theorem. 


Proposition 6.24 Let g be integrable on the interval [a, b] and let f be increasing
and positive on the same interval. Then there exists $\xi$ in [a, b] such that

\[ \int_a^b fg = f(b)\int_{\xi}^b g. \]


We now have two classes of functions that can be uniformly approximated by step 
functions: the continuous functions and the monotonic functions. We can reasonably 
ask what these classes have in common. Functions of both classes are integrable. 
However there are integrable functions that cannot be uniformly approximated by step 
functions. This is because there is a very simple necessary and sufficient condition 
for uniform approximation by step functions to be possible: that at every point the
one-sided limits f(x−) and f(x+) exist (though only f(a+) at a and f(b−) at b).
This is, of course, satisfied by continuous functions and by monotonic functions. The 
proof of this is not hard but uses the Heine—Borel theorem, which lies just outside 
the scope of this work. 
Exercise Find an integrable function for which one of the one-sided limits fails to
exist for at least one point. 


Even though we cannot always approximate an integrable function uniformly by 
step functions, all is not lost. A different type of approximation is available, also by 
step functions, known as approximation in the mean. 

Let f be integrable on [a, b] and let $\varepsilon > 0$. Then there exists a step function h
such that

\[ \int_a^b |f - h| < \varepsilon. \]

This means that, on average, or to use the correct term, in the mean, h is close to f.
At the same time there may be points where f − h is big, even arbitrarily big, but
this can only occur on “small” sets of points. All we have to do to produce h is to
find a partition P for which $U(f, P) - L(f, P) < \varepsilon$ and set h equal to $m_k$ (or $M_k$)
on the subinterval $]t_k, t_{k+1}[$.

In fact, approximation in the mean could have been used instead of uniform 
approximation to prove the second mean value theorem, as the reader should be able 
to check by looking over step 2 of the proof. However, nothing is gained in generality 
as monotonicity of the function f seems to be essential. 

The ideas explored in this nugget are very valuable and capable of much variation; 
in order to prove something about a whole class of functions we may first be able to 
prove it for a class of simpler functions (in this case step functions) and then use an 
approximation technique to obtain the conclusion in general. 


6.8.1 Exercises 


1. Prove Bonnet’s theorem (Proposition 6.24). 

2. A function f possessing one-sided limits at each point of its domain has been
called a regulated function (notably by Bourbaki). Prove that the following condition
is necessary and sufficient for a function f with domain [a, b] to be regulated:
for each $\varepsilon > 0$ and x in [a, b] there exists $\delta > 0$, such that for all s and
t in [a, b] that satisfy either $x - \delta < s < t < x$ or $x < s < t < x + \delta$ we have
$|f(s) - f(t)| < \varepsilon$.

3. Let f be integrable on [0, 1]. Prove the following limits:

   (a) $\lim_{n\to\infty} \int_0^1 f(x)\,x^n\,dx = 0$.

   (b) $\lim_{n\to\infty} \int_0^1 f(x)\sin n\pi x\,dx = 0$.

Hint. Let 0 < a < b < 1 and do them for a function f equal to 1 for a < x < b
and equal to 0 otherwise. Extend the conclusions to the case when f is a step
function and finally use approximation by step functions in the mean. Use your
school knowledge of the circular functions sin x and cos x and their derivatives.
Note. The second limit is a key result in the theory of Fourier series.


6.8.2 Pointers to Further Study 


— Functional analysis 
— Heine—Borel theorem 


Chapter 7
The Elementary Transcendental Functions


Whatever sines, tangents and secants 

Present you with after lengthy and heavy labour; 

Fair Reader, this little table of logarithms will give you, 
Without serious toil immediately. 


J. Napier 


It is a part of all analysis courses to give definitions of the elementary transcendental 
functions, namely trigonometric functions, logarithms and exponentials, that build 
on ideas of analysis only and do not rely on geometry. In spite of the last denial, 
the definition of the trigonometric functions in the analysis literature may often owe 
something to geometric intuition, whilst not needing it logically. This is the case 
here. 

Before that though a word of explanation on the terminology is due. The term 
“transcendental” does not reflect any particular wow-factor nor is it related to mysti- 
cism. It refers to functions that cannot be built up starting with the function x, using 
constants and algebraic processes. For example polynomials and rational functions, 
together with roots of polynomial equations such as $\sqrt{x}$, are not transcendental; they
are algebraic. Even the function y = f(x), defined as the unique real root of the
equation $y^5 + y + x = 0$, is algebraic, although we have no closed expression for it.

A precise definition of transcendental function is that it is a differentiable function 
f, that does not satisfy a polynomial equation; that is, there is no polynomial P(x, y) 
of two variables, such that P(x, f(x)) = 0 for all x. One really needs more generality 
here by allowing the coefficients of P, as well as the variables x and y to be complex 
numbers. See Chap. 9 for a brief introduction to complex numbers. 

The term “elementary” refers here, rather arbitrarily it must be said, to a set of 
functions that were available to mathematicians before the advent of calculus, and 
were needed in geometry and arithmetic. They are of course just those functions 
that are met with in high-school algebra, the trigonometric functions, exponential 
functions and logarithms. As new transcendental functions were introduced, spurred 
largely by the work of Euler in the eighteenth century, who defined for example the
Gamma function, the epithet “elementary” began to be attached to the previously 
known ones, whilst the new functions began to be called special functions, a term 
which should probably be disparaged as having no useful meaning at all, unless it is 
“non-elementary transcendental function that has received a name”.

Much of the material of this chapter, though not the way it is developed, will 
be familiar from school mathematics. Where possible we move the text along quite 
briskly, with short paragraphs producing the well known classical formulas, whilst 
dwelling longer on the less familiar parts. 


7.1 Trigonometric Functions 


In the simplest geometric manifestation that goes beyond their use to solve triangles, 
the functions sine and cosine, known as the circular functions, provide a parametri- 
sation of the unit circle in the Euclidean plane, that starts at the point (1, 0), travels 
anticlockwise and has speed 1. This is the reason for the name “circular functions”.

To be more precise the circle $x^2 + y^2 = 1$ is parametrised by setting x = cos t
and y = sin t; the parametrisation satisfies $\sqrt{x'(t)^2 + y'(t)^2} = 1$ (that is, the speed
is 1), the starting point (x(0), y(0)) is (1, 0), and the direction of increasing t is
anticlockwise. In this context, anticlockwise simply means passing the points (1, 0), 
(0, 1), (—1, 0), (0, —1) in that order; this choice of direction probably seems arbitrary 
to all but mathematicians. 

We are going to define these functions using analysis alone, but underlying our 
approach is the idea of moving with speed 1 along the unit circle. Speed is a familiar
everyday concept; a car’s speedometer measures it for example. Underlying it is arc 
length; it is measured by the car’s milometer and is also an everyday concept. Arc 
length is a much simpler concept than area, which many authors have used to motivate 
a rigorous definition of sine and cosine. One suspects a shift from thinking that area 
is simpler than arc length to the opposite, that reflects a society in ever-increasing 
motion. 


7.1.1 First Steps Towards Defining Sine and Cosine

First we define arcsine. For all x in the interval ]−1, 1[ we define

\[ \arcsin x = \int_0^x \frac{1}{\sqrt{1 - t^2}}\,dt. \]

Underlying this is the idea that for x > 0 the integral is the length of the arc of the
unit circle $x^2 + y^2 = 1$ from the point (0, 1) to the point $(x, \sqrt{1 - x^2})$. The angle
that this arc spans is measured in radians by the length of the arc (this is just the
definition of radian), and its sine by trigonometry is x (Fig. 7.1).

Fig. 7.1 Defining arcsine by arc length

The function arcsin x as just defined is strictly increasing on the interval ]−1, 1[,
is an odd function (that is, arcsin(−x) = −arcsin x, compare Sect. 6.4 Exercise 9),
and by the fundamental theorem satisfies

\[ \frac{d}{dx}\arcsin x = \frac{1}{\sqrt{1 - x^2}} \qquad (-1 < x < 1). \]


These facts are immediate consequences of the definition that the reader is invited to 
check. 
Since for 0 < t < 1 we plainly have

\[ \frac{1}{\sqrt{1 - t^2}} < \frac{1}{\sqrt{1 - t}} \]

it follows, for 0 < x < 1, that

\[ \arcsin x < \int_0^x \frac{1}{\sqrt{1 - t}}\,dt = -2\sqrt{1 - t}\,\Big|_0^x = 2 - 2\sqrt{1 - x} < 2. \]

The limit $\lim_{x\to 1-} \arcsin x$ therefore exists (since the function is bounded above and
increasing), and is less than or equal to 2. We define the number $\pi$ by setting

\[ \frac{\pi}{2} := \lim_{x\to 1-} \arcsin x. \]

Since arcsin x is an odd function we also have

\[ \lim_{x\to -1+} \arcsin x = -\frac{\pi}{2}. \]

This definition of $\pi$ virtually establishes it as half the perimeter of the unit circle;
about as classical a definition as one could wish for.
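
The integral defining arcsine, and the bound used to show that the limit exists, can be examined numerically. This is a minimal sketch under stated assumptions: the integral is approximated by a midpoint Riemann sum, the sample points 0.5, 0.9 and 0.99 are arbitrary, the library function `math.asin` serves only as an independent comparison, and the helper name `arcsin_integral` is hypothetical.

```python
import math

def arcsin_integral(x, m=20_000):
    """Midpoint Riemann sum for the integral of 1/sqrt(1 - t^2) from 0 to x,
    where 0 <= x < 1."""
    h = x / m
    return sum(1.0 / math.sqrt(1.0 - ((j + 0.5) * h) ** 2) for j in range(m)) * h

results = []
for x in (0.5, 0.9, 0.99):
    value = arcsin_integral(x)
    bound = 2.0 - 2.0 * math.sqrt(1.0 - x)   # the estimate 2 - 2*sqrt(1 - x) < 2
    results.append((x, value, bound))
    print(f"x = {x:4.2f}   integral = {value:.6f}   "
          f"math.asin = {math.asin(x):.6f}   bound = {bound:.6f}")
```

The computed values increase with x and stay below the bound, in line with the argument that the limit as x tends to 1 exists and is at most 2.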

We define sin x (the sine of x) for $-\pi/2 < x < \pi/2$ as the inverse function to
arcsine. The function sin x is then strictly increasing, odd, and carries the interval
$]-\pi/2, \pi/2[$ on to the interval ]−1, 1[. The reader can convince themselves that this
is a sensible definition by trying the geometry exercise illustrated in Fig. 7.2.

Consider the relation y = sin x, still only defined for $-\pi/2 < x < \pi/2$. We are
going to compute the derivative dy/dx. Inverting, we find that x = arcsin y and
hence

\[ \frac{dx}{dy} = \frac{1}{\sqrt{1 - y^2}} \]

so that

\[ \frac{dy}{dx} = \sqrt{1 - y^2} = (1 - y^2)^{1/2}. \]


dx 


Differentiate again, as we obviously may, using the chain rule. We find

$$\frac{d^2y}{dx^2} = \frac{1}{2}(1-y^2)^{-1/2}(-2y)\frac{dy}{dx} = \frac{1}{2}(1-y^2)^{-1/2}(-2y)(1-y^2)^{1/2} = -y.$$


The function sin x therefore satisfies the differential equation

$$\frac{d^2y}{dx^2} + y = 0$$

on the interval −π/2 < x < π/2. The equation is usually written as y'' + y = 0.
We observe that sin x satisfies the conditions (using the notation explained in
Sect. 5.3)

$$y(0) = 0, \qquad \frac{dy}{dx}\bigg|_{x=0} = 1.$$


It is usual to write the second condition as y'(0) = 1. Another thing to note is that

$$\lim_{x\to\frac{\pi}{2}-}\frac{d}{dx}\sin x = \lim_{x\to-\frac{\pi}{2}+}\frac{d}{dx}\sin x = 0.$$

Fig. 7.2 A geometry exercise. Given that the arc from (1, 0) to P has length t, use the definition of arcsine in the text to show that P has the coordinates marked in the figure.


7.1.2 The Differential Equation y'' + y = 0


The function f(x) = sin x, defined at present in the interval −π/2 < x < π/2, is
only one of infinitely many functions that satisfy the differential equation y'' + y = 0.
In general, by a solution to the differential equation y'' + y = 0 in an interval
]a, b[, we shall mean a twice differentiable function f : ]a, b[ → ℝ that satisfies
f''(x) + f(x) = 0 for all x in ]a, b[. There are infinitely many such solutions. From
one we can make others by the operations of translation, reflection and differentiation. 
From two we can make others by taking linear combinations. These claims are 
summarised in the next proposition, and their proofs are left to the reader. 


Proposition 7.1 


(1) If f(x) is a solution of the differential equation y'' + y = 0 in the interval ]a, b[,
then the function g(x) := f(x + c) is a solution in the interval ]a − c, b − c[
and the function h(x) := f(−x) is a solution in the interval ]−b, −a[.

(2) All solutions to y'' + y = 0 have derivatives of all orders, and these derivatives are
themselves solutions.

(3) If f and g are solutions in the interval ]a, b[, and A and B are constants, then
Af + Bg is also a solution in ]a, b[.


7.1.3 Extending sin x 


We extend the function sin x beyond the interval ]−π/2, π/2[ on to the whole number
line ℝ, in such a way that the extended function satisfies the differential equation
y'' + y = 0.

Begin with the function sin x on the interval ]−π/2, π/2[. We are going to describe operations on the graph y = sin x. Each operation is one of the transformations described in Proposition 7.1. First reflect the graph about the line x = 0. Next translate the reflection rightwards to give a new graph on the interval ]π/2, 3π/2[. Glue this graph to the first graph. This produces an extension to the interval ]−π/2, 3π/2[. The extension process is illustrated in Fig. 7.3.

Fig. 7.3 Extending sin x

At the join x = π/2 the two graphs (the first graph and the reflected and translated
graph) have the same y-value, namely 1, and the same y'-value 0. Both graphs also
satisfy y'' + y = 0 so that they have the same second derivative also. The extension
therefore satisfies y'' + y = 0 in the interval ]−π/2, 3π/2[.

The extension just defined has the same values of y, y' and y'' at the endpoints
−π/2 and 3π/2. The length of this interval is 2π. We can therefore extend it by
repeated translation to the whole number line ℝ in such a way that we obtain a
function with period 2π. This function will also satisfy the differential equation
y'' + y = 0.
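
The reflect, translate and glue steps can be mirrored in a few lines of code. The sketch below is only an illustration under stated assumptions: `base_sine` stands in for sin on the closed interval [−π/2, π/2] and is borrowed from Python's `math` module (the text's sine is defined as an inverse function, which is less convenient to compute), and the names `base_sine` and `extended_sine` are hypothetical.

```python
import math

def base_sine(x):
    """Stand-in for sin restricted to [-pi/2, pi/2]; borrowed from the math module for illustration."""
    return math.sin(x)

def extended_sine(x):
    """Extend base_sine to the whole real line by reflection, translation and periodicity."""
    # Repeated translation: shift x by a multiple of 2*pi into [-pi/2, 3*pi/2[.
    x = (x + math.pi / 2) % (2 * math.pi) - math.pi / 2
    if x <= math.pi / 2:
        return base_sine(x)        # the original piece of the graph
    return base_sine(math.pi - x)  # the reflected and translated piece on ]pi/2, 3*pi/2[

for x in (-7.0, -1.0, 0.3, 2.0, 4.0, 10.0):
    print(f"x = {x:5.1f}: extended = {extended_sine(x): .12f}, math.sin = {math.sin(x): .12f}")
```

The second branch evaluates the base function at π − x, which is exactly the reflection about x = 0 followed by a translation through π.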

Exercise Explain why the result of glueing together the pieces of the graph in the 


above construction produces a function that has derivatives of all orders everywhere, 
unlike the function constructed in an apparently similar way in Sect. 5.4 Exercise 10. 


7.1.4 Defining Cosine 
We define cosine by setting 
$$\cos x = \sin\left(\frac{\pi}{2} - x\right), \qquad (x \in \mathbb{R}).$$


Then by Proposition 7.1 the function cos x is a solution to y” + y = 0. We note that 
it satisfies the conditions y(0) = 1, y’(0) = 0. 

Now we conclude that the function A cos x + B sin x is a solution to the differ- 
ential equation y” + y = 0 that satisfies the initial conditions y(0) = A, y’(0) = B. 
Much more is true. 


Proposition 7.2. The function A cos x + B sin x is the unique solution to y” + y = 0 
that satisfies y(0) = A, y'(0) = B. 


Proof Let f be a solution of y'' + y = 0 that satisfies the same initial conditions as
A cos x + B sin x. Set

$$g(x) = f(x) - (A\cos x + B\sin x).$$

Then g is a solution to y'' + y = 0 that satisfies g(0) = g'(0) = 0. Let

$$\phi(x) = g(x)^2 + g'(x)^2.$$

We then have

$$\phi'(x) = 2g(x)g'(x) + 2g'(x)g''(x) = 2g'(x)\bigl(g''(x) + g(x)\bigr) = 0.$$

We conclude that φ is a constant. But this constant is 0 because φ(0) = 0. Since we
now have $g(x)^2 + g'(x)^2 = 0$ for all x we conclude that g(x) = 0 for all x.
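
Proposition 7.2 can be checked numerically. The sketch below is an illustration, not part of the proof: it integrates y'' + y = 0 by the classical fourth-order Runge-Kutta method (a choice made for this example; the function name `solve_oscillator` is hypothetical) and compares the result with A cos x + B sin x. It also evaluates y² + y'², the quantity that the proof shows to be constant along any solution.

```python
import math

def solve_oscillator(A, B, x_end, steps=10_000):
    """Integrate y'' + y = 0 with y(0) = A, y'(0) = B by RK4; return (y, y') at x_end."""
    h = x_end / steps
    y, v = A, B  # v stands for y'
    for _ in range(steps):
        # First-order system: y' = v, v' = -y
        k1y, k1v = v, -y
        k2y, k2v = v + 0.5 * h * k1v, -(y + 0.5 * h * k1y)
        k3y, k3v = v + 0.5 * h * k2v, -(y + 0.5 * h * k2y)
        k4y, k4v = v + h * k3v, -(y + h * k3y)
        y += h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
    return y, v

A, B, x = 1.5, -0.7, 2.0
y, v = solve_oscillator(A, B, x)
print("numerical y(x):     ", y)
print("A cos x + B sin x:  ", A * math.cos(x) + B * math.sin(x))
print("y^2 + y'^2 at x:    ", y * y + v * v, " (initially", A * A + B * B, ")")
```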


We shall rely greatly on this proposition to derive properties of the circular functions. 
The method serves as a model for how to obtain properties of transcendental functions 
from the differential equations that they satisfy. 


7.1.5 Differentiating cos x and sin x 
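Proposition 7.3

$$\frac{d}{dx}\sin x = \cos x, \qquad \frac{d}{dx}\cos x = -\sin x.$$

Proof Since the function y = sin x satisfies $y' = \sqrt{1-y^2}$ near x = 0 and y'' = −y, we have

$$\frac{d}{dx}\sin x = \sqrt{1-(\sin x)^2}, \qquad \frac{d^2}{dx^2}\sin x = -\sin x.$$

From this we get

$$\frac{d}{dx}\sin x\,\bigg|_{x=0} = 1, \qquad \frac{d^2}{dx^2}\sin x\,\bigg|_{x=0} = 0.$$

By Proposition 7.1 (item 2) the function (d/dx) sin x is a solution to y'' + y = 0,
and we have just seen that it satisfies the same initial conditions at x = 0 as does
cos x. It therefore equals cos x by Proposition 7.2.

Finally we obtain

$$\frac{d}{dx}\cos x = \frac{d}{dx}\sin\left(\frac{\pi}{2}-x\right) = -\cos\left(\frac{\pi}{2}-x\right) = -\sin x.$$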

7.1.6 Addition Rules for Sine and Cosine 


As usual we shall use the notation $\sin^2 x$ to mean $(\sin x)^2$ (and not sin(sin x)), and
similarly for all positive integral powers. Negative powers are written differently;
$\sin^{-1} x$, if used at all, denotes the inverse function arcsin x and not 1/sin x.


Proposition 7.4 For all x and y we have the addition formulas 


$$\sin(x + y) = \sin x\cos y + \cos x\sin y$$
$$\cos(x + y) = \cos x\cos y - \sin x\sin y.$$

Proof Hold y fixed. Set

$$F(x) := \sin x\cos y + \cos x\sin y$$
$$G(x) := \sin(x + y).$$


The function F satisfies F” + F = 0, F(0) = sin y, F’(0) = cos y as the reader can 
easily check. The function G satisfies the same differential equation and the same 
initial conditions. We conclude that F and G are the same function. This gives the 
first rule. 

For the second rule write 


$$\cos(x + y) = \sin\left(\frac{\pi}{2} - x - y\right) = \sin\left(\left(\frac{\pi}{2} - x\right) + (-y)\right)$$


and apply the first rule. 


Proposition 7.5 
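$$\cos^2 x + \sin^2 x = 1, \qquad (x \in \mathbb{R}).$$


Proof Since cosine is even and sine is odd, the second addition formula gives $\cos^2 x + \sin^2 x = \cos x\cos(-x) - \sin x\sin(-x) = \cos(x - x) = \cos 0 = 1$.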


7.1.7 Parametrising the Circle 


Proposition 7.6 Cosine and sine provide a parametrisation of the unit circle. If x =
cos t and y = sin t then the point (x, y) travels once around the circle $x^2 + y^2 = 1$
as t goes from 0 to 2π; more precisely each point on the circle is passed once as t
ranges over the interval [0, 2π[.


Proof Let (a, b) satisfy $a^2 + b^2 = 1$. We have to show that there exists a unique t in
[0, 2π[, such that a = cos t and b = sin t. Assume first that −1 < a < 1. There exist
a unique $t_1 \in\ ]0, \pi[$ and a unique $t_2 \in\ ]\pi, 2\pi[$, such that $\cos t_1 = a$ and $\cos t_2 = a$,
since cosine is strictly monotonic on each of these intervals and maps each of them
on to ]−1, 1[. Then we must have either $\sin t_1 = b$ or $\sin t_2 = b$, but not both because
$\sin t_1 = -\sin t_2$ and b ≠ 0. Finally if a = −1 we must have t = π, and if a = 1 we must have
t = 0.
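
The proof is constructive enough to be turned into a small routine. The following Python sketch is an illustration only; the name `angle_of_point` is hypothetical, and the library function `math.acos` is used to produce the candidate $t_1$ in [0, π]. The second candidate is $t_2 = 2\pi - t_1$, and the sign of b decides between them.

```python
import math

def angle_of_point(a, b):
    """Return t in [0, 2*pi[ with (cos t, sin t) = (a, b), for (a, b) on the unit circle."""
    # Candidate in [0, pi]; the clamp only guards against rounding slightly outside [-1, 1].
    t1 = math.acos(max(-1.0, min(1.0, a)))
    if b >= 0:
        return t1                  # upper half of the circle, or one of the points (1, 0), (-1, 0)
    return 2.0 * math.pi - t1      # lower half: the second candidate, in ]pi, 2*pi[

for t in (0.0, 0.3, 2.0, math.pi, 3.5, 5.9):
    print(f"t = {t:.4f}  recovered = {angle_of_point(math.cos(t), math.sin(t)):.4f}")
```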


7.1.8 The Trigonometric Functions tan x, cot x, sec x, csc x 


We define tangent, cotangent, secant and cosecant as

$$\tan x = \frac{\sin x}{\cos x}, \qquad \cot x = \frac{\cos x}{\sin x}, \qquad \sec x = \frac{1}{\cos x}, \qquad \csc x = \frac{1}{\sin x}.$$


Each of these functions is undefined at some points, namely at the points where the 
denominator is zero. Together with the circular functions sin x and cos x and their 
inverses, these functions and their inverses make up the collection of trigonometric 
functions. The name “circular function” is widely applied to all these functions. Their 
derivatives are (the reader may check them)

$$\frac{d}{dx}\tan x = \sec^2 x, \qquad \frac{d}{dx}\sec x = \sec x\tan x,$$

$$\frac{d}{dx}\cot x = -\csc^2 x, \qquad \frac{d}{dx}\csc x = -\csc x\cot x.$$


The tangent function tan x has period π and its graph has vertical asymptotes at
odd multiples of π/2. Its restriction to the interval ]−π/2, π/2[ is strictly increasing
and maps that interval on to ℝ. Its inverse function, arctangent, is important. It
maps ℝ on to ]−π/2, π/2[. Let us find its derivative. Begin with y = arctan x. Then
x = tan y and $dx/dy = \sec^2 y$. Thus we find

$$\frac{dy}{dx} = \frac{1}{\sec^2 y} = \cos^2 y = \frac{\cos^2 y}{\cos^2 y + \sin^2 y} = \frac{1}{1 + \tan^2 y} = \frac{1}{1 + x^2}.$$


7.1.9 The Derivatives of arcsin x, arccos x and arctan x 


All of these derivatives are important for integration because they provide us with
some new and useful antiderivatives. They are

$$\frac{d}{dx}\arcsin x = \frac{1}{\sqrt{1-x^2}}, \qquad \frac{d}{dx}\arccos x = -\frac{1}{\sqrt{1-x^2}},$$

$$\frac{d}{dx}\arctan x = \frac{1}{1+x^2}.$$


Firstly, these formulas provide the antiderivative

$$\int \frac{1}{\sqrt{1-x^2}}\,dx = \arcsin x + C \quad\text{or}\quad -\arccos x + C.$$

These two versions indicate why mathematics teachers lay so much emphasis on
including the constant C. On the interval ]−1, 1[ (where the integrand is defined) we
have $\arcsin x = \frac{\pi}{2} - \arccos x$.

Secondly, and it is one of the most useful antiderivatives, we have

$$\int \frac{1}{1+x^2}\,dx = \arctan x + C.$$


The interval of definition here is all of R. 
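
The remark about the constant C in the two versions of the antiderivative of $1/\sqrt{1-x^2}$ is easy to see numerically. The short Python sketch below (an illustration using the library functions `math.asin` and `math.acos`) prints the difference between arcsin x and −arccos x at several points; it is the same number π/2 every time, so the two antiderivatives differ only by a constant.

```python
import math

for x in (-0.9, -0.3, 0.0, 0.5, 0.99):
    difference = math.asin(x) - (-math.acos(x))
    print(f"x = {x:5.2f}   arcsin x - (-arccos x) = {difference:.15f}")

print(f"pi/2 = {math.pi / 2:.15f}")
```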


7.1.10 Exercises 


1. Check the formulas for the derivatives of tan x, sec x, csc x and cot x. 
2. Obtain the addition formula for tan x (familiar from school mathematics) 


$$\tan(u + v) = \frac{\tan u + \tan v}{1 - \tan u\tan v}.$$


3. Let t = tan(x/2). Express sin x, cos x and tan x as rational functions of t.
4. There is an addition formula for arctangent. It is often written simply as 


$$\arctan x + \arctan y = \arctan\frac{x+y}{1-xy}.$$


Care is required because the range of arctangent is the interval ]−π/2, π/2[.


(a) Prove the formula in the case that
$$-\frac{\pi}{2} < \arctan x + \arctan y < \frac{\pi}{2}.$$


(b) How would you modify the formula in the cases that arctan x + arctan y falls
outside the interval ]−π/2, π/2[?


5. Prove the following formulas, no doubt familiar to you from school mathematics. 
They are important tools for integration. 


(a) $1 + \tan^2 x = \sec^2 x$

(b) $\sin 2x = 2\sin x\cos x$

(c) $\cos 2x = \cos^2 x - \sin^2 x$

(d) $\cos 2x = 2\cos^2 x - 1$

(e) $\cos 2x = 1 - 2\sin^2 x$

(f) $\cos 3x = 4\cos^3 x - 3\cos x$

(g) $\sin 3x = 3\sin x - 4\sin^3 x$


6. Let a and b be distinct real numbers. 


(a) Show that the function a cos x + b sin x is periodic with period 2π, and oscillates
between the values $\sqrt{a^2 + b^2}$ and $-\sqrt{a^2 + b^2}$.

(b) Show that the function $a\cos^2 x + b\sin^2 x$ is periodic with period π, and
oscillates between the values a and b.




(c) Perform a similar analysis on the function $a\cos^3 x + b\sin^3 x$. In addition to
finding its maximum and minimum values determine all other local maxima
and minima.


7. We cannot assign an angle t to each point P of the unit circle so that t is a
continuous function of P. We have to omit one point of the circle. Most commonly
it is the point (−1, 0) (for some reason this is thought to be the point one is least
likely to visit; but that varies of course). Omitting this point we assign an angle t
to the rest of the circle such that t is in the open interval −π < t < π. With this
definition for t, derive the following formulas. It is assumed that (x, y) lies on
the unit circle.


(a) $t = \arctan\dfrac{y}{x}$ if x > 0;

(b) $t = \arctan\dfrac{y}{x} + \pi$ if x < 0 and y > 0;

(c) $t = \arctan\dfrac{y}{x} - \pi$ if x < 0 and y < 0;

(d) $t = 2\arctan\dfrac{y}{1+x}$ if x ≠ −1.


8. Prove that sin x is not an algebraic function. 
Hint. Suppose that a formula f(x, sin x) = 0 is valid where $f(x, y) = \sum_{k=0}^{m} p_k(x)y^k$ and the coefficients $p_k(x)$ are polynomials. We can assume that $p_0(x)$ is not the zero polynomial (if it was we could lower m). From your knowledge of sin x derive an impossible property of $p_0(x)$.


7.2 Logarithms and Exponentials 


As in the case of the circular functions we use an integral to give the primary defi- 
nition, which means that the logarithm is defined first and the exponential function 
then appears as its inverse. Another important differential equation is introduced. 

The exponential function with base a, denoted by $a^x$, generalises the rational
power $a^{m/n}$. Most commonly when we talk about the exponential function, without
mentioning the base, we have in mind a particular base e, which has the property 
that e* is a solution of the differential equation dy/dx = y. This makes e* the most 
important function in analysis. 


7.2.1 Defining the Natural Logarithm and the Exponential 
Function 


We begin by defining the natural logarithm by

$$\ln x = \int_1^x \frac{1}{t}\,dt, \qquad (x > 0).$$

Then ln 1 = 0, ln x > 0 for x > 1, ln x < 0 for 0 < x < 1 and

$$\frac{d}{dx}\ln x = \frac{1}{x}, \qquad (x > 0).$$

This makes ln x an antiderivative for 1/x on the interval ]0, ∞[, thus filling in an
important gap in the list of antiderivatives (by cheating as one might say), and solving
the differential equation y' = 1/x.

A point of notation. Some denote the natural logarithm by log x. Others use log x 
for the common logarithm, that is, the logarithm to base 10. We prefer the formation
"ln", in which the "n" could refer to "natural" or, and this seems more likely, to Napier,
the originator of logarithms. In fact natural logarithms were often called Napierian 
logarithms. The situation is made all the more confusing by the fact that Napier did 
not use what we now call the natural logarithm but another quantity closely related 
to it. 

The laws of logarithms follow from the differential equation that ln x satisfies.


Proposition 7.7 (First law of logarithms) For all x > 0 and y > 0 we have

$$\ln(xy) = \ln x + \ln y.$$


Proof Fix y and set f(x) = ln(xy). Then

$$f'(x) = \frac{y}{xy} = \frac{1}{x} = \frac{d}{dx}\ln x.$$

We conclude that f(x) − ln x is a constant C. By putting x = 1 we see that
C = ln y.


Proposition 7.8 The function ln x is strictly increasing on the interval ]0, ∞[, and
we have the limits

$$\lim_{x\to\infty}\ln x = \infty \quad\text{and}\quad \lim_{x\to 0+}\ln x = -\infty.$$


Proof We have

$$\frac{d}{dx}\ln x = \frac{1}{x} > 0$$

and therefore ln x is strictly increasing. Now $\ln(2^n) = n\ln 2$ for each natural number
n by the first law of logarithms, and since ln 1 = 0 and ln x is increasing we have
ln 2 > 0. We conclude, since ln x is increasing, that $\lim_{x\to\infty}\ln x = \infty$. Set $\frac12$ in place
of 2 and conclude, since $\ln\frac12 < 0$, that $\lim_{x\to 0+}\ln x = -\infty$.

We define the exponential function exp : ℝ → ]0, ∞[ as the inverse function to
ln x. Let us differentiate exp x. Set y = exp x. Then x = ln y and we find

$$\frac{dx}{dy} = \frac{1}{y}$$

giving

$$\frac{dy}{dx} = y = \exp x.$$


We have proved the following. 


Proposition 7.9 The function y = exp x is a solution to the differential equation

$$\frac{dy}{dx} = y.$$


The first law of logarithms turns into the first law of exponentials. 


Proposition 7.10 For all real numbers x and y we have 
$$\exp(x + y) = \exp x\cdot\exp y.$$

Proof Set x = ln s, y = ln t where s > 0 and t > 0. Now

$$\exp(x + y) = \exp(\ln s + \ln t) = \exp(\ln(st)) = st = \exp x\cdot\exp y.$$


We define the number e := exp 1. We will soon see that

$$e = 2.7\,\overbrace{1828}\,\overbrace{1828}\,\overbrace{459045}\,\ldots$$


The brackets are intended as an aid to memorising the digits. 

The numbers π and e are irrational (we will prove these claims later), and are the
most important irrational numbers, though no doubt opinions may differ on this, and
on whether $\sqrt{2}$ should be included for historical reasons.

We now have $\exp n = e^n$ for all natural numbers n and it is easy to see that
$\exp(m/n) = e^{m/n}$ for all rationals m/n. The function exp x therefore extends the
power $e^{m/n}$ to all the reals. We therefore define $e^x := \exp x$ for all real numbers x.
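
The route taken here (logarithm as an integral, exponential as its inverse) can be followed numerically. The Python sketch below is only an illustration under stated assumptions: Simpson's rule and Newton's method are choices made for the example, not methods used in the text, and the names `ln_by_integral` and `exp_by_inversion` are hypothetical. Newton's method is applied to the equation ln y = x, using the fact that the derivative of ln y is 1/y.

```python
import math

def ln_by_integral(x, n=10_000):
    """Composite Simpson approximation of the integral of 1/t from 1 to x, for x > 0 (n even)."""
    h = (x - 1.0) / n
    total = 1.0 + 1.0 / x                      # endpoint values of 1/t at t = 1 and t = x
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) / (1.0 + k * h)
    return total * h / 3.0

def exp_by_inversion(x, iterations=40):
    """Solve ln(y) = x for y by Newton's method, for x >= 0, starting from y = 1."""
    y = 1.0
    for _ in range(iterations):
        y -= (ln_by_integral(y) - x) * y       # Newton step: divide by the derivative 1/y
    return y

print("ln 2 by integral:   ", ln_by_integral(2.0), " math.log(2):", math.log(2.0))
print("e = exp 1 by inversion:", exp_by_inversion(1.0), " math.e:", math.e)
print("first law: ln 6 - (ln 2 + ln 3) =", ln_by_integral(6.0) - ln_by_integral(2.0) - ln_by_integral(3.0))
```

The restriction to x ≥ 0 keeps every Newton iterate between 1 and the root; for negative x one can use exp(−x) = 1/exp x, which follows from Proposition 7.10.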


7.2.2 Exponentials and Logarithms with Base a 


Let a > 0. We define $a^x := \exp(x\ln a)$ for all real numbers x. This is the exponential
function with base a. The general power $a^x$ is not defined in real analysis for negative
a, at least not as a function. Even so, certain values exist; for example −1 has a real
cube root.

The case a = 1 is uninteresting, since then $a^x$ is the constant 1. If a > 1 then
ln a > 0; $a^x$ is then strictly increasing and maps ℝ on to ]0, ∞[. If 0 < a < 1 then
ln a < 0; $a^x$ is strictly decreasing and maps ℝ on to ]0, ∞[.

The inverse function to $a^x$ in the case a ≠ 1 is the logarithm with base a. It is
denoted by the function symbol $\log_a$. In other words the equation $y = a^x$ inverts
to $x = \log_a y$. In practice we only use logarithms with base higher than 1, most
frequently e (the natural logarithm) or 10 (the common logarithm). We continue to
denote $\log_e x$ by ln x.

If a > 1 then the function $a^x$ is strictly increasing and maps ℝ on to ]0, ∞[. It
follows that its inverse function $\log_a$ is also strictly increasing and maps ]0, ∞[ on
to ℝ.


7.2.3 The Laws of Logarithms and Exponents 


Proposition 7.11 For all a > 0 and all x, y in ℝ, we have

(1) $a^{x+y} = a^x\cdot a^y$, (first law of exponents).

(2) $(a^x)^y = a^{xy}$, (second law of exponents).

For all a > 0, b > 0 and all real x, we have

(3) $(ab)^x = a^x b^x$, (third law of exponents).

For all a > 0 with a ≠ 1, and all x > 0 and y > 0, we have

(4) $\log_a(xy) = \log_a x + \log_a y$, (first law of logarithms).

For all x > 0 and all real y we have

(5) $\log_a(x^y) = y\log_a x$, (second law of logarithms).

Proofs (1) $a^{x+y} = e^{(x+y)\ln a} = e^{x\ln a + y\ln a} = e^{x\ln a}e^{y\ln a} = a^x a^y$.

(2) $(a^x)^y = e^{y\ln(a^x)} = e^{y\ln(e^{x\ln a})} = e^{xy\ln a} = a^{xy}$.

(3) $(ab)^x = e^{x\ln(ab)} = e^{x\ln a + x\ln b} = e^{x\ln a}e^{x\ln b} = a^x b^x$.

(4) Let $\log_a x = s$ and $\log_a y = t$. Then

$$xy = a^s a^t = a^{s+t},$$

so that

$$s + t = \log_a(xy).$$

(5) Let $\log_a x = t$. Then

$$x^y = (a^t)^y = a^{yt},$$

so that

$$yt = \log_a(x^y).$$


7.2.4 Differentiating a^x and x^a


Let a > 0 be a constant. Then

$$\frac{d}{dx}a^x = \frac{d}{dx}e^{x\ln a} = (\ln a)e^{x\ln a} = (\ln a)a^x, \qquad (-\infty < x < \infty).$$

On the other hand (and here we conclude the story of differentiating $x^a$ begun when
a was an integer):

$$\frac{d}{dx}x^a = \frac{d}{dx}e^{a\ln x} = e^{a\ln x}\,\frac{a}{x} = x^a\,\frac{a}{x} = ax^{a-1}, \qquad (0 < x < \infty).$$
dx dx x x 


The restriction to positive x is only made because a is unspecified. For certain values, 
for example when a is an integer or a rational with odd denominator, we can allow 
negative x. 


7.2.5 Exponential Growth 


The functions $e^x$ and $e^{-x}$ overpower all power functions $x^a$ as x → ∞. Conversely
the power functions overpower ln x.


Proposition 7.12 Let a > 0. Then we have the limits:

(1) $\lim_{x\to\infty}\dfrac{e^x}{x^a} = \infty$

(2) $\lim_{x\to\infty}x^a e^{-x} = 0$

(3) $\lim_{x\to 0+}x^a\ln x = 0$

(4) $\lim_{x\to\infty}\dfrac{\ln x}{x^a} = 0$.

Proofs (1) The quickest way to obtain this, and indeed the other limits, is to use
L'Hôpital's rule in the ∞/∞ version (Proposition 5.16). Repeated differentiation
of numerator and denominator of $e^x/x^a$ leads eventually to $e^x/(cx^b)$ with b ≤ 0 and
c > 0. This tends to ∞.

Another, and very natural, proof of this limit builds on the power series $e^x = \sum_{n=0}^{\infty}x^n/n!$. However we do not yet have this at our disposal.

(2) This is the reciprocal of the first limit.

(3) We have $x^a\ln x = \ln x/x^{-a}$ and the denominator tends to ∞ as x → 0+.
Differentiating numerator and denominator gives $-x^a/a$ with the limit 0 as x → 0+.

(4) Differentiating numerator and denominator leads to $1/(ax^a)$ with limit 0 as
x → ∞.
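
A few numerical values give a feeling for these limits. The Python sketch below is an illustration only; the helper names `power_times_decay` and `log_over_power` are hypothetical and the sample exponents are arbitrary choices. It also shows that the convergence in (4) can be very slow when a is small.

```python
import math

def power_times_decay(x, a):
    """The quantity x^a * e^(-x) from limit (2)."""
    return x ** a * math.exp(-x)

def log_over_power(x, a):
    """The quantity ln(x) / x^a from limit (4)."""
    return math.log(x) / x ** a

for x in (10.0, 50.0, 100.0, 200.0):
    print(f"x = {x:6.1f}   x^5 e^(-x) = {power_times_decay(x, 5.0):.3e}")

for x in (1e10, 1e50, 1e100):
    print(f"x = {x:.0e}   ln(x)/x^0.1 = {log_over_power(x, 0.1):.3e}")
```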
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7.2.6 Hyperbolic Functions 


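We define the hyperbolic sine, hyperbolic cosine and hyperbolic tangent. These are
defined for all real x by

$$\sinh x = \frac{e^x - e^{-x}}{2}, \qquad \cosh x = \frac{e^x + e^{-x}}{2}, \qquad \tanh x = \frac{\sinh x}{\cosh x}.$$

Occasionally one sees also the functions

$$\operatorname{sech} x = \frac{1}{\cosh x}, \qquad \operatorname{csch} x = \frac{1}{\sinh x}, \qquad \coth x = \frac{1}{\tanh x}.$$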


We state the most important properties of the hyperbolic functions for ready reference, 
leaving the proofs to the exercises. 
For all x we have the important formula 


cosh?x — sinh?x = 1. 


This means that the parametrised curve x = cosh t, y = sinh t lies on the hyperbola
$x^2 - y^2 = 1$, and it explains the epithet hyperbolic.
The derivatives of the three important hyperbolic functions are

$$\frac{d}{dx}\sinh x = \cosh x, \qquad \frac{d}{dx}\cosh x = \sinh x, \qquad \frac{d}{dx}\tanh x = \frac{1}{\cosh^2 x}.$$


The inverses of the hyperbolic functions are used in integration, as they furnish
new antiderivatives. The function sinh is strictly increasing, odd and maps ℝ on to ℝ.
The function cosh is even, strictly increasing on ]0, ∞[, and maps this interval on
to ]1, ∞[. The function tanh is strictly increasing, odd and maps ℝ on to ]−1, 1[.
We therefore have three inverse functions, each defined with a different domain, and
bijective with the indicated codomains:

$$\sinh^{-1} : \mathbb{R} \to \mathbb{R}, \qquad \cosh^{-1} : \ ]1, \infty[\ \to\ ]0, \infty[, \qquad \tanh^{-1} : \ ]-1, 1[\ \to \mathbb{R}.$$

Their derivatives are important as they provide valuable antiderivatives:

$$\frac{d}{dx}\sinh^{-1}x = \frac{1}{\sqrt{x^2+1}}, \qquad \frac{d}{dx}\cosh^{-1}x = \frac{1}{\sqrt{x^2-1}} \quad (x > 1), \qquad \frac{d}{dx}\tanh^{-1}x = \frac{1}{1-x^2} \quad (-1 < x < 1).$$

The following rules are frequently used to provide alternative versions of the
antiderivatives of the derivatives in the previous display:

$$\sinh^{-1}x = \ln\bigl(x + \sqrt{x^2+1}\bigr)$$
$$\cosh^{-1}x = \ln\bigl(x + \sqrt{x^2-1}\bigr), \qquad (x > 1),$$
$$\tanh^{-1}x = \frac{1}{2}\ln\left(\frac{1+x}{1-x}\right), \qquad (-1 < x < 1).$$
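
The logarithmic formulas for the inverse hyperbolic functions can be compared with the library versions. The Python sketch below is an illustration; the function names ending in `_log` are hypothetical, while `math.asinh`, `math.acosh` and `math.atanh` are the standard library functions.

```python
import math

def asinh_log(x):
    """Inverse hyperbolic sine via ln(x + sqrt(x^2 + 1)), for any real x."""
    return math.log(x + math.sqrt(x * x + 1.0))

def acosh_log(x):
    """Inverse hyperbolic cosine via ln(x + sqrt(x^2 - 1)), for x > 1."""
    return math.log(x + math.sqrt(x * x - 1.0))

def atanh_log(x):
    """Inverse hyperbolic tangent via (1/2) ln((1 + x)/(1 - x)), for -1 < x < 1."""
    return 0.5 * math.log((1.0 + x) / (1.0 - x))

for x in (-2.0, 0.3, 5.0):
    print(f"asinh({x}): {asinh_log(x):.12f} vs {math.asinh(x):.12f}")
for x in (1.5, 5.0):
    print(f"acosh({x}): {acosh_log(x):.12f} vs {math.acosh(x):.12f}")
for x in (-0.9, 0.3):
    print(f"atanh({x}): {atanh_log(x):.12f} vs {math.atanh(x):.12f}")
```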


7.2.7 The Differential Equation y’ = ky 


We know that $(d/dx)e^{kx} = ke^{kx}$. The function $y = e^{kx}$ is therefore a solution of the
differential equation dy/dx = ky. The function $y = Ce^{kx}$ (where C is a constant) is
also a solution, and it satisfies the condition y(0) = C. It is an important fact that it
is the unique solution that satisfies this condition. We write the differential equation
in the short form y' = ky.


Proposition 7.13 Let k ≠ 0. The differential equation y' = ky has a unique solution,
defined for all ℝ, that satisfies the condition y(0) = C. This solution is
$y = Ce^{kx}$.


Proof Let y = φ(x) be a solution that satisfies φ(0) = C. Then

$$\frac{d}{dx}\bigl(e^{-kx}\phi(x)\bigr) = e^{-kx}\phi'(x) - ke^{-kx}\phi(x) = e^{-kx}\bigl(\phi'(x) - k\phi(x)\bigr) = 0.$$

We conclude that $e^{-kx}\phi(x)$ is a constant, and by considering x = 0 we see that the
constant is C. That is, $\phi(x) = Ce^{kx}$.
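
As an informal check of the proposition, one can solve y' = ky step by step and watch the result approach $Ce^{kx}$. The Python sketch below uses Euler's method, which is a choice made for this illustration and not a method from the text; the name `euler_solution` and the sample values of k, C and x are hypothetical. With n equal steps Euler's method produces $C(1 + kx/n)^n$.

```python
import math

def euler_solution(k, C, x, n):
    """Approximate y(x) for y' = k*y, y(0) = C, using n Euler steps of size x/n."""
    h = x / n
    y = C
    for _ in range(n):
        y += h * k * y   # one Euler step: y(t + h) is approximated by y(t) + h * y'(t)
    return y

k, C, x = 0.5, 3.0, 2.0
exact = C * math.exp(k * x)
for n in (100, 1_000, 10_000, 100_000):
    approx = euler_solution(k, C, x, n)
    print(f"n = {n:6d}   Euler = {approx:.8f}   C e^(kx) = {exact:.8f}   error = {exact - approx:.2e}")
```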


7.2.8 The Antiderivative ∫ (1/x) dx


We have seen that ln x is an antiderivative for the function 1/x on the interval ]0, ∞[.
What then is an antiderivative for 1/x on the interval ]−∞, 0[? For x < 0 we have

$$\frac{d}{dx}\ln(-x) = \frac{-1}{-x} = \frac{1}{x}.$$

We have therefore the two antiderivatives, each for the appropriate interval:

$$\int\frac{1}{x}\,dx = \ln x + C \quad (x > 0), \qquad \int\frac{1}{x}\,dx = \ln(-x) + C \quad (x < 0).$$

Some recommend writing this as one formula $\int(1/x)\,dx = \ln|x| + C$, considered
as valid for all x ≠ 0. However this is not correct and should be discouraged.

Here's why. The domain of 1/x is the union ]−∞, 0[ ∪ ]0, ∞[. We could construct
an antiderivative on this domain, not of the form ln |x| + C, by letting g(x) = ln x
for x > 0 and g(x) = ln(−x) + 1 for x < 0. This is possible because the domain is
disconnected.
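
The counterexample can be inspected numerically. In the Python sketch below (an illustration; the central-difference helper `derivative` is an assumption of the example, not something from the text) the function g has derivative 1/x on both half-lines, yet g(x) − ln |x| takes the value 0 for x > 0 and 1 for x < 0, so g is not of the form ln |x| + C with a single constant C.

```python
import math

def g(x):
    """The antiderivative of 1/x from the text: ln x for x > 0, ln(-x) + 1 for x < 0."""
    return math.log(x) if x > 0 else math.log(-x) + 1.0

def derivative(f, x, h=1e-6):
    """Central-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

for x in (-3.0, -0.5, 0.5, 3.0):
    print(f"x = {x:5.1f}   g'(x) = {derivative(g, x): .6f}   1/x = {1.0 / x: .6f}   "
          f"g(x) - ln|x| = {g(x) - math.log(abs(x)):.1f}")
```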

It is possible, by using a more general interpretation of differentiation than the
usual one, to make the equation (d/dx) ln |x| = 1/x correct over the whole of ℝ,
not excluding 0. This is accomplished by the theory of distributions (or generalised
functions). It is tricky; even 1/x has to be reinterpreted, not as an ordinary function
assigning to each number its reciprocal, but as a distribution.


7.2.9 Exercises 


1. Show that for all x > 0 we have

$$\log_{10} x = \frac{\ln x}{\ln 10}.$$


Note. This has the practical significance that once logarithms to base e have been tabulated the 
logarithms to base 10 can be found by a straightforward multiplicative conversion, requiring 
the single number In 10. A good approximation to In 10 is 2.3026 and a rough value 2.3, the 
usefulness of knowing which is recounted in an anecdote in "Surely You're Joking, Mr. Feynman!".
2. Find a formula for the derivative $(d/dx)\log_a x$.
3. The natural logarithm enables us to give a more accurate estimate of the diver- 
gence of the harmonic series than could be obtained in Chap. 3. 


(a) Show that

$$\ln(n+1) < 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n} < 1 + \ln n.$$

Hint. Find lower and upper sums for the integral $\int_1^n (1/x)\,dx$ with a suitable
partition.

(b) Give an estimate of how many terms of the harmonic series are needed to 
exceed 100. 


4. Let a, b and c be real numbers and suppose that b < c. Show that there exists
K, such that $x^a e^{bx} < e^{cx}$ for all x > K.


5. Prove that

$$\lim_{x\to\infty}\left(1 + \frac{1}{x}\right)^x = e.$$

Hint. Take the logarithm. This is often a good idea in limits involving powers.


6. Continuation of the previous exercise. Refer to Sect. 3.8 Exercise 2 and deduce
that

$$e = \sum_{n=0}^{\infty}\frac{1}{n!}.$$

Use this to compute e to 12 places of decimals using a simple calculator, more
precisely, using only the numerical buttons together with [M+], [÷], [=] and
[MR]. Keep off the factorial button!


Note. This is a beautiful example of how a highly impractical formula can be transformed into
a highly practical one. To compute e to any accuracy from the limit formula of the previous
exercise one would have to calculate for example $1.00000001^{100000000}$, and that only gives
7 places of decimals and can hardly be computed without using the exponential function in
some form. From the series about 15 terms are enough to get 12 decimal places, and can be
computed rapidly.


7. Calculate the following limits (including proofs that they exist):


(a) lim x 


(c) lm — , where m is an integer. 


(d) lim (tan.x)"™"** 


X>F 


(e) $\lim_{x\to\infty}\ln|P(x)|/\ln|Q(x)|$, where P and Q are polynomials, such that
P has degree m and Q has degree n.
(x + 1)* —x* 


(f) =lim ———————.,_ where s is areal constant. 
x00 xotl 


8. (◊) Determine the limit

$$\lim_{n\to\infty}\sum_{k=1}^{n}\frac{1}{n+k}$$

by viewing it as a Riemann sum (Sect. 6.6).
9. (a) Show that for every power function $x^m$ we have

$$\lim_{x\to\infty}x^m e^{-x^2} = \lim_{x\to-\infty}x^m e^{-x^2} = 0.$$

(b) Show that for all positive integers n we can write

$$\frac{d^n}{dx^n}e^{-x^2} = P_n(x)e^{-x^2}$$

where $P_n(x)$ is a polynomial with degree n.

(c) Show that, for all positive integers m and n,

$$\lim_{x\to\infty}x^m\frac{d^n}{dx^n}e^{-x^2} = \lim_{x\to-\infty}x^m\frac{d^n}{dx^n}e^{-x^2} = 0.$$

Note. The property of $e^{-x^2}$, that all its derivatives overpower all powers of x at ±∞, puts it
into a class of functions that have been called, rather prosaically, smooth functions of rapid
decrease. They are important in the theory of distributions, briefly mentioned in the text.

10. The equation $x^y = y^x$ defines a curve (of a sort) in the quadrant of the (x, y)-plane
for which x > 0 and y > 0. Sketch this curve without the help of a calculating device.

11. Derive the law of exponents $e^{s+t} = e^s e^t$ directly from the differential equation
y' = y that is satisfied by $e^x$. Note the role played by the uniqueness part of
Proposition 7.13.

12. Let g be a continuous function on the domain ℝ and suppose that for each c
the differential equation y' = g(y) has a unique solution, that satisfies the initial
condition y(0) = c, and is defined on the domain ℝ. We can define the function
φ(x, c) of the two variables x and c, so that, as a function of x, it is the solution
that satisfies the condition y(0) = c.


(a) Show that the translate of a solution of y' = g(y) is again a solution; that is,
if y = f(x) is a solution and a a number then f(x − a) is also a solution.

(b) Show that there is a unique solution that satisfies y(a) = c, and that it is
φ(x − a, c).

(c) Prove that for all c, s and t we have the formula φ(s + t, c) = φ(s, φ(t, c)).

Note. In case g(y) = y the formula in (c) is the law of exponents $e^{s+t} = e^s e^t$. As the existence
and uniqueness of solutions are widely applicable properties of differential equations, this
indicates that the exponential function is capable of great generalisation.


13. Prove the following (doubtlessly familiar) formulas. They are important tools
for integration.


(a) $\cosh^2 x - \sinh^2 x = 1$
(b) $1 + \tan^2 x = \sec^2 x$.


14. Test the truth of the formula $\cosh^2 x - \sinh^2 x = 1$ on a calculator that has buttons
for computing the hyperbolic functions. You will probably find that the output
changes from 1 to 0 somewhere between x = 10 and x = 20. Why is this? Should
we conclude that the formula is false in the "real world"?

15. Check the formulas for the derivatives of sinh x, cosh x and tanh x.

16. Check the formulas for the derivatives of $\sinh^{-1}$, $\cosh^{-1}$ and $\tanh^{-1}$.

17. Prove the formulas already given in the text:


(a) $\sinh^{-1}x = \ln\bigl(x + \sqrt{x^2+1}\bigr)$
(b) $\cosh^{-1}x = \ln\bigl(x + \sqrt{x^2-1}\bigr)$, (1 < x < ∞)
(c) $\tanh^{-1}x = \dfrac{1}{2}\ln\dfrac{1+x}{1-x}$, (−1 < x < 1).

18. We saw that the function $\cosh^{-1}x$ is an antiderivative of $(x^2 - 1)^{-1/2}$ on the
interval 1 < x < ∞. Write down an antiderivative for the function $(x^2 - 1)^{-1/2}$
on the interval −∞ < x < −1, utilising the function $\cosh^{-1}$ as it was defined
in the text with domain ]1, ∞[.

19. Show that the addition formulas for the hyperbolic functions are as follows: 


(a) $\cosh(x + y) = \cosh x\cosh y + \sinh x\sinh y$
(b) $\sinh(x + y) = \sinh x\cosh y + \cosh x\sinh y$
(c) $\tanh(x + y) = \dfrac{\tanh x + \tanh y}{1 + \tanh x\tanh y}$


and the duplication formulas (useful for integration):


(d) $\cosh 2x = \cosh^2 x + \sinh^2 x = 2\cosh^2 x - 1 = 1 + 2\sinh^2 x$

(e) $\sinh 2x = 2\sinh x\cosh x$

(f) $\tanh 2x = \dfrac{2\tanh x}{1 + \tanh^2 x}$

20. Show that the formulas x = cosh t, y = sinh t, (−∞ < t < ∞), provide a
parametrisation of the right-hand branch of the hyperbola $x^2 - y^2 = 1$. You
will need to show that every point on the curve corresponds to a unique value
of t.

21. Prove that $e^x$ is not an algebraic function.
Hint. Suppose that $f(x, e^x) = 0$ where $f(x, y) = \sum_{k=0}^{m}p_k(x)y^k$ and the coefficients
$p_k(x)$ are polynomials. One may assume that neither $p_0(x)$ nor $p_m(x)$ is
the zero polynomial, and that m is the lowest number for which such a formula
is valid. What happens if one differentiates the formula $f(x, e^x) = 0$?
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An important role was played by differential equations in the way we defined the 
elementary transcendental functions and obtained their basic properties. The question 
arises as to whether we could have gone further and actually defined cosine and sine as 
the solutions of the differential equation y” + y = 0 that satisfy the initial conditions, 
respectively, y(0) = 1, y’(0) = 0 and y(0) = 0, y’(0) = 1. Similarly whether we 
could have defined exp x as the solution of y’ = y that satisfies y(0) = 1. 

The answer is that this is a perfectly feasible procedure, but to carry it through 
we need to know that these differential equations have unique solutions satisfying 
given initial conditions. This is a proposition that typically comes later in courses 
of analysis, relying, as it does, on the notion of uniform convergence. One does not 
want to delay the definition of the elementary functions longer than is absolutely 
necessary. 




When the existence and uniqueness theorem for differential equations is in place 
it becomes an invaluable source for defining new transcendental functions. We give 
here a brief preview, without proofs, of what is needed. 

A linear homogeneous differential equation of order n has the form

$$p_n(x)y^{(n)} + p_{n-1}(x)y^{(n-1)} + \cdots + p_1(x)y' + p_0(x)y = 0. \tag{7.1}$$

The coefficient functions $p_k(x)$, k = 0, 1, ..., n are supposed to be continuous functions
on a common open interval A. A solution of the equation on A is an n-times
differentiable function φ(x) that satisfies

$$p_n(x)\phi^{(n)}(x) + p_{n-1}(x)\phi^{(n-1)}(x) + \cdots + p_0(x)\phi(x) = 0, \qquad (x \in A). \tag{7.2}$$


Special cases are the first-order equation y’ — y = 0 and the second-order equation 
y” + y = 0 that we have already studied and solved. In both these cases A = R. 

The existence theorem for the problem (7.1) applies to an interval A on which the 
leading coefficient function p,(x) has no zeros. If p,(x) has zeros one has to restrict 
to an interval that excludes them before applying the existence theorem. 


Proposition 7.14 If $p_n(x)$ has no zeros in A then the set of all solutions of (7.1) on
A is an n-dimensional vector space of functions over the real field R.


The proposition implies that if we can find n solutions on A that are linearly
independent over the reals, then they form a basis for the space of all solutions.
Every other solution is a linear combination of these n solutions in precisely one
way. For the first-order equation $y' - y = 0$ the solution space is one-dimensional
and, for example, the function exp x, taken alone, forms a basis. Every solution is
of the form C exp x for some C. For the second-order equation $y'' + y = 0$, the two
solutions cos x and sin x form a basis; every solution has the form C cos x + D sin x,
for some constants C and D.

The second result concerns uniqueness. 


Proposition 7.15 If $p_n(x)$ has no zeros in A, if $x_0$ is a point in A and
$c_0, c_1, ..., c_{n-1}$ are given numbers, then there exists a unique solution of (7.1)
that satisfies the initial conditions

\[ y(x_0) = c_0, \quad y'(x_0) = c_1, \quad ..., \quad y^{(n-1)}(x_0) = c_{n-1}. \]
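
The two propositions can be illustrated numerically for the equation y'' + y = 0. The
following is only a sketch, not part of the theory: the function name rk4_second_order
and the choice of the classical Runge-Kutta scheme are assumptions made for this
illustration. It integrates the equation from the initial conditions y(0) = c0,
y'(0) = c1 and lets one compare the result with c0 cos x + c1 sin x, the combination
that the uniqueness statement singles out.

```python
import math


def rk4_second_order(c0, c1, x_end, steps=1000):
    """Approximate y(x_end) for y'' + y = 0 with y(0) = c0, y'(0) = c1.

    The equation is rewritten as the first-order system y' = v, v' = -y
    and advanced with the classical fourth-order Runge-Kutta scheme.
    """
    h = x_end / steps
    y, v = c0, c1
    for _ in range(steps):
        k1y, k1v = v, -y
        k2y, k2v = v + 0.5 * h * k1v, -(y + 0.5 * h * k1y)
        k3y, k3v = v + 0.5 * h * k2v, -(y + 0.5 * h * k2y)
        k4y, k4v = v + h * k3v, -(y + h * k3y)
        y += h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
    return y


if __name__ == "__main__":
    x = 2.0
    # Initial conditions (1, 0) should reproduce cos, and (0, 1) should reproduce sin.
    print(rk4_second_order(1.0, 0.0, x), math.cos(x))
    print(rk4_second_order(0.0, 1.0, x), math.sin(x))
```

A numerical agreement of this kind proves nothing, of course; it only makes the
statement of the propositions concrete.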


Examples of transcendental functions that are defined by differential equations of 
this kind include the following: 


(a) Bessel functions. These are solutions on the interval ]0, ∞[ of Bessel's equation:

\[ x^2 y'' + x y' + (x^2 - \alpha^2) y = 0. \]

The constant $\alpha$ is called the order of the Bessel function.

(b) Legendre functions. These are solutions, usually studied on the interval ]-1, 1[,
of Legendre's equation:

\[ (1 - x^2) y'' - 2x y' + \ell(\ell + 1) y = 0 \]

where $\ell$ is a constant.

(c) Hypergeometric functions. These are solutions of the hypergeometric equation:

\[ x(1 - x) y'' + (c - (a + b + 1)x) y' - ab y = 0 \]

where a, b and c are constants. They were extensively studied by Gauss and
include as special cases a vast range of transcendental functions.


Non-linear differential equations are also a major source of new transcendental
functions. Indeed we recall that in the interval $]-\pi/2, \pi/2[$ the function sin x is a
solution of the non-linear differential equation

\[ y' = \sqrt{1 - y^2}. \]

Problems in classical mechanics give rise to problems of the form

\[ y' = \sqrt{P(y)} \]

where P(y) is a third-degree or fourth-degree polynomial. The solutions can be
expressed by a new class of periodic functions, the elliptic functions. The methods
used by Abel and Jacobi, to introduce these functions early in the nineteenth century,
showed that analysis was an inexhaustible source of new functions.


7.3.1 Exercises 
1. We defined the circular functions by studying the integral

\[ \int_0^x \frac{dt}{\sqrt{1 - t^2}} \]

and inverting the function defined by it. In a similar way we can study the more
general integral

\[ F(x) = \int_0^x \frac{1}{\sqrt{(1 - t^2)(1 - k^2 t^2)}} \, dt, \quad -1 < x < 1, \]

where k is a constant in the range 0 < k < 1. This so-called elliptic integral was a
major puzzle to mathematicians until Abel and Jacobi, independently so it seems,
pointed out that one should think of F as an inverse function to an elliptic function.
In the following sequence of exercises the reader is invited to construct an elliptic
function using the same steps as were used in the text to construct sin x.

(a) Show that F(x) is an odd function and is strictly increasing on the interval
]-1, 1[.

(b) Show that the limit

\[ L := \lim_{x \to 1-} F(x) \]

exists and that $L < \pi / (2\sqrt{1 - k^2})$.

(c) The inverse function to F maps the interval ]-L, L[ on to the interval ]-1, 1[.
We have here the function $\mathrm{sn}_k(x)$, one of a group of functions called Jacobi
elliptic functions with parameter k. Show that $\mathrm{sn}_k(x)$ satisfies the differential
equation

\[ y'' + (1 + k^2) y - 2k^2 y^3 = 0 \tag{7.3} \]

in the interval ]-L, L[.

(d) The differential equation (7.3) shares many properties with the equation
$y'' + y = 0$. Show that if y = f(x) is a solution in an interval ]a, b[, then the
translation y = f(x - c) is a solution in ]a + c, b + c[, and the reflection
y = f(-x) a solution in ]-b, -a[.

(e) Show that the function $\mathrm{sn}_k(x)$, initially defined in the interval ]-L, L[,
extends to a function, also denoted by $\mathrm{sn}_k(x)$, on all of R, that satisfies (7.3),
and has period 4L.


2. For each natural number n we define the function

\[ f_n(x) = e^{x^2} \frac{d^n}{dx^n} e^{-x^2}. \]

(a) Show that $f_n$ is a polynomial of degree n, and is moreover an even function
when n is even and an odd function when n is odd (for the definitions of even
and odd functions see Sect. 5.4 Exercise 8).

(b) Show that $f_{n+1}(x) = f_n'(x) - 2x f_n(x)$ for n = 0, 1, 2, ...

(c) Show that $f_n$ has n distinct real roots.

(d) Show that $f_n$ satisfies the differential equation

\[ y'' - 2x y' + 2n y = 0. \]

Note. The functions $f_n$ are (up to a normalisation constant) the Hermite polynomials. The
differential equation $y'' - 2x y' + 2\lambda y = 0$, where $\lambda$ is a real parameter, is known as
Hermite's equation. Non-polynomial solutions to Hermite's equation exist; they are transcendental
functions and some are non-elementary.


3. For each natural number n define

\[ \phi_n(x) = \frac{d^n}{dx^n} (x^2 - 1)^n. \]

Obviously $\phi_n$ is a polynomial of degree n. Up to a normalisation constant it is
the n-th Legendre polynomial. Its zeros are all real and simple, and they lie in the
interval ]-1, 1[ (see Sect. 5.9 Exercise 7).

(a) Derive the formulas

\[ \phi_{n+1}'(x) = 2(n + 1) x \phi_n'(x) + 2(n + 1)^2 \phi_n(x) \]

\[ \phi_{n+1}'(x) = ((x^2 - 1) \phi_n'(x))' + 2(n + 1) x \phi_n'(x) + (n + 1)(n + 2) \phi_n(x). \]
Hint. Attack the expressions 


n+2 n+l 
2 n+l 
Sy? — 1)"*! and 
dxnt2 


2 4yjn+l 
Sate = 1) 
with Leibniz’s formula (Sect. 5.4 Exercise 5). 
(b) Deduce that $\phi_n(x)$ satisfies Legendre's equation with $\ell = n$.
Note. Legendre's equation also has transcendental solutions, some of which are non-elementary.
The standard form of the Legendre polynomial is $P_n(x) := \phi_n(x) / (2^n n!)$.
4. Some Bessel functions are elementary.
(a) Show that the functions $x^{-1/2} \cos x$ and $x^{-1/2} \sin x$ satisfy Bessel's equation
with $\alpha = 1/2$ on the interval ]0, ∞[.

To conjure up more examples you can use the following steps:

(b) We make a change of variables in Bessel's equation. More precisely we
introduce a new variable u (really a function of x) related to the variable y by
$u = x^{-\alpha} y$. Show that y(x) satisfies Bessel's equation with order $\alpha$ (as always
we mean in the interval ]0, ∞[) if and only if u(x) satisfies the equation

\[ x u'' + (2\alpha + 1) u' + x u = 0. \tag{7.4} \]

(c) Show that if u(x) is a solution of (7.4) for a given $\alpha$ then the function

\[ v(x) = \frac{1}{x} u'(x) \]

is a solution of (7.4) in which $\alpha + 1$ replaces $\alpha$.
(d) Deduce that if y(x) satisfies Bessel's equation with order $\alpha$ then the function

\[ x^{\alpha + 1} \left( \frac{1}{x} \frac{d}{dx} \right) (x^{-\alpha} y(x)) \]

satisfies Bessel's equation with order $\alpha + 1$.
(e) Deduce that if y(x) satisfies Bessel's equation with order $\alpha$ then the function

\[ x^{\alpha + n} \left( \frac{1}{x} \frac{d}{dx} \right)^n (x^{-\alpha} y(x)) \]

satisfies Bessel's equation with order $\alpha + n$.
(f) Deduce that the functions

\[ x^{n + 1/2} \left( \frac{1}{x} \frac{d}{dx} \right)^n \left( \frac{\cos x}{x} \right), \quad x^{n + 1/2} \left( \frac{1}{x} \frac{d}{dx} \right)^n \left( \frac{\sin x}{x} \right) \]

satisfy Bessel's equation with $\alpha = n + \frac{1}{2}$.

In these formulas the expression $\frac{1}{x} \frac{d}{dx}$ is a differential operator that converts
the following function u to u'/x. More precisely

\[ \left( \frac{1}{x} \frac{d}{dx} \right) u(x) = \frac{1}{x} u'(x). \]

Its n-th power is the differential operator that applies $\frac{1}{x} \frac{d}{dx}$ to the succeeding
function n times in a row. More precisely, we have the inductive definition

\[ \left( \frac{1}{x} \frac{d}{dx} \right)^{n+1} u(x) = \left( \frac{1}{x} \frac{d}{dx} \right) \left( \frac{1}{x} \frac{d}{dx} \right)^n u(x). \]

7.3.2 Pointers to Further Study 


— Differential equations. 
— Special functions. 


Chapter 8
The Techniques of Integration


Design is not making beauty, beauty emerges from selection, 
affinities, integration, love. 


Louis Kahn, architect 


This chapter covers a classical set of techniques which essentially can be used to find 
antiderivatives of common functions. The time was when a mathematical education 
placed great emphasis on acquiring skill in using these techniques. Arguably they 
are less important now, but many find that successfully using them to find a difficult 
antiderivative is a satisfying experience. 


8.1 Integration by Parts and by Substitution 


Two immensely important rules for finding antiderivatives correspond, respectively, 
to the rule for differentiation of a product and the chain rule, when these are inverted 
by means of the fundamental theorem. 


Proposition 8.1 (Integration by parts) Let f and g be functions defined in the open
interval A. Assume that f and g are differentiable and that their derivatives, f' and
g', are continuous. Then

\[ \int_a^b f g' = f(b) g(b) - f(a) g(a) - \int_a^b f' g \]

for all a and b in A.

Proof We have that (fg)' = fg' + f'g, which is continuous; so the fundamental
theorem applies and we have

\[ \int_a^b (f g' + f' g) = f g \Big|_a^b = f(b) g(b) - f(a) g(a). \]


The rule is most frequently used to handle antiderivatives in the following way. If u
and v are functions of x and we know an antiderivative for u'v, then an antiderivative
for uv' is given by

\[ \int u v' = u v - \int u' v. \]

Given the task of finding an antiderivative $\int f$, skill and experience may suggest
suitable functions u and v, such that f = uv', and an antiderivative for u'v is then
easy to find.
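
As a small worked illustration (the integrand is chosen here for the purpose and is not
taken from the text), take f(x) = x sin x. Choosing u = x and v = -cos x gives uv' = x sin x,
and u'v = -cos x has an obvious antiderivative:

```latex
\begin{align*}
\int x \sin x \, dx &= x(-\cos x) - \int 1 \cdot (-\cos x) \, dx \\
                    &= -x \cos x + \sin x .
\end{align*}
```

Differentiating the right-hand side gives -cos x + x sin x + cos x = x sin x, as it should.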


Proposition 8.2 (Integration by substitution) Let A and B be open intervals, let
f : A → R be continuous, and let $\phi$ : B → R be differentiable, with $\phi'$ continuous.
Assume that $\phi(B) \subset A$. Then for all a and b in B we have

\[ \int_a^b (f \circ \phi) \phi' = \int_{\phi(a)}^{\phi(b)} f \]

or, in Leibniz's notation

\[ \int_a^b f(\phi(t)) \phi'(t) \, dt = \int_{\phi(a)}^{\phi(b)} f(x) \, dx. \]


The use of distinct variables t and x has no logical significance; they are bound
variables. However it supports the usual interpretation of the rule, that the integral
on the left is obtained from the integral on the right by means of the substitution
$x = \phi(t)$. The substitution replaces dx by $\phi'(t) \, dt$, a replacement that is obtained
formally by writing

\[ x = \phi(t) \implies \frac{dx}{dt} = \phi'(t) \implies dx = \phi'(t) \, dt. \]

This piece of Leibniz notation, though illegally separating dx and dt, is very useful
in practice.


Proof of the Rule Let F be an antiderivative for f (one exists since f is continuous).
The composed function $F \circ \phi$ is defined on the domain B and by the chain rule

\[ (F \circ \phi)' = (F' \circ \phi) \phi' = (f \circ \phi) \phi'. \]

We see that $(F \circ \phi)'$ is continuous and therefore

\[ \int_{\phi(a)}^{\phi(b)} f = (F \circ \phi)(b) - (F \circ \phi)(a) = \int_a^b (F \circ \phi)' = \int_a^b (f \circ \phi) \phi'. \]


We note two interesting points:

(a) It is not necessary for $\phi$ to be monotonic. It does not even need to map the interval
[a, b] on to the interval $[\phi(a), \phi(b)]$, but we do not want $\phi(t)$ to go outside the
domain of f.

(b) The rule is often used to transform antiderivatives in the following way. If
$x = \phi(t)$ then

\[ \int f(x) \, dx = \int f(\phi(t)) \phi'(t) \, dt + C. \tag{8.1} \]

The equation in point (b) is interpreted to mean that if F is an antiderivative for f then
$F \circ \phi$ is an antiderivative for $(f \circ \phi) \phi'$. Conversely, if we can find an antiderivative
for $(f \circ \phi) \phi'$, let us call it G, then $G \circ \phi^{-1}$ is an antiderivative for f. This requires
us to deploy the inverse $\phi^{-1}$; so we would need to work on an interval in which $\phi$ is
monotonic.

Point (b) indicates two somewhat different ways to apply the rule; they are 
explored in the next section. 
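
Point (a) can be tried out numerically. The sketch below is an illustration under stated
assumptions: the composite Simpson rule and the names simpson and both_sides are choices
made here, not part of the text. It evaluates both sides of the substitution rule for a
substitution that is not monotonic on the interval of integration.

```python
import math


def simpson(g, a, b, n=2000):
    """Composite Simpson approximation of the integral of g from a to b (n even)."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3


def both_sides(f, phi, dphi, a, b):
    """Return the integral of (f o phi) * phi' over [a, b] and of f over [phi(a), phi(b)]."""
    left = simpson(lambda t: f(phi(t)) * dphi(t), a, b)
    right = simpson(f, phi(a), phi(b))
    return left, right


if __name__ == "__main__":
    # phi(t) = sin t rises to 1 and then falls again on [0, 3*pi/4].
    print(both_sides(lambda x: x * x, math.sin, math.cos, 0.0, 3 * math.pi / 4))
```

Here sin t overshoots the interval with endpoints sin 0 and sin(3π/4), yet the two numbers
agree, because all that matters is that sin t stays inside the domain of f.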


8.1.1 Finding Antiderivatives by Substitution 


Let us look at two examples of the use of substitution to find an antiderivative. 
They illustrate two different ways for applying the rule. In each case we present the 
calculation, as it would normally be presented, and then its explanation. 


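(A) Find the integral on the right in (8.1) by solving the integral on the left.

\[ \int \sqrt{1 - t^2} \, t \, dt = -\frac{1}{2} \int \sqrt{1 - t^2} \, (-2t) \, dt
   = -\frac{1}{2} \int \sqrt{x} \, dx = -\frac{1}{3} x \sqrt{x} = -\frac{1}{3} (1 - t^2) \sqrt{1 - t^2}. \]

Explanation. We have $\sqrt{1 - t^2} \, t \, dt = f(\phi(t)) \phi'(t) \, dt$ where

\[ \phi(t) = 1 - t^2, \quad f(x) = -\frac{1}{2} \sqrt{x}. \]

The antiderivative obtained is valid on the interval -1 < t < 1.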
(B) Find the integral on the left in (8.1) by solving the integral on the right. This is
more complicated than case A because in the final step we have to substitute for t as
a function of x. This requires the inverse function $\phi^{-1}$.

As an example consider the following calculation. For -1 < x < 1 we have

\[ \int \sqrt{1 - x^2} \, dx = \int \sqrt{1 - \sin^2 t} \, \cos t \, dt = \int \cos^2 t \, dt
   = \int \left( \frac{1}{2} + \frac{1}{2} \cos 2t \right) dt \]

\[ = \frac{1}{2} t + \frac{1}{4} \sin 2t = \frac{1}{2} t + \frac{1}{2} \sin t \cos t
   = \frac{1}{2} \arcsin x + \frac{1}{2} x \sqrt{1 - x^2}. \]

Explanation. We set x = sin t, dx = cos t dt. We have to choose an interval for t to
fix an inverse for sin t. The simplest is to keep t in the interval $]-\pi/2, \pi/2[$. Then sin t is
increasing and maps this interval on to the interval ]-1, 1[. Moreover $\sqrt{1 - \sin^2 t} = \cos t$
(and not -cos t, owing to the interval chosen for t). Finally x is reintroduced.
Since $-\pi/2 < t < \pi/2$ we have t = arcsin x and $\cos t = \sqrt{1 - x^2}$ (and not $-\sqrt{1 - x^2}$,
again, thanks to the choice of interval).
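
A result such as the one in example B can always be checked by differentiation. As a quick
numerical sanity check (a sketch only; the helper names and the central-difference step are
assumptions made here), one can compare a difference quotient of the antiderivative
(1/2) arcsin x + (1/2) x sqrt(1 - x^2) with the integrand sqrt(1 - x^2) at a few points of
]-1, 1[.

```python
import math


def antiderivative(x):
    """Candidate from example B: (1/2) arcsin x + (1/2) x sqrt(1 - x^2)."""
    return 0.5 * math.asin(x) + 0.5 * x * math.sqrt(1.0 - x * x)


def central_difference(g, x, h=1e-5):
    """Symmetric difference quotient approximating g'(x)."""
    return (g(x + h) - g(x - h)) / (2.0 * h)


def max_error(points):
    """Largest gap between the numerical derivative and sqrt(1 - x^2) over the points."""
    return max(
        abs(central_difference(antiderivative, x) - math.sqrt(1.0 - x * x))
        for x in points
    )


if __name__ == "__main__":
    print(max_error([-0.9, -0.5, 0.0, 0.3, 0.8]))
```

The points are kept away from the endpoints of the interval, where the difference quotient
would step outside the domain of arcsin.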


Traditionally, finding an antiderivative of a function f by the techniques of this
chapter was called solving the integral $\int f(x) \, dx$. Every integral solved can be added
to a catalogue and used to solve further integrals.


8.1.2 Exercises 


1. Solve the following integrals:
(a) $\int x e^x \, dx$
(b) $\int x^2 \cos x \, dx$
(c) $\int e^x \cos x \, dx$

Hint. Call the integral I. Integrating twice by parts leads to $I = e^x \sin x + e^x \cos x - I$.
Look out for other opportunities to use this trick.


(d) fre cos x dx 

(e) frnvas, (x > 0) 

(f) [xmeas, (x > 0) 
Inx 

(g) [rie (x > 0) 
x 


(h) [ xann?ax, (x > 0). 


2. Solve the following useful integrals, where a is a positive constant:


(c) (——« (x > a) 


Note. In this integral, as in some others in the following exercises, a different domain
from the one specified is possible, in which case a different formula may be needed for
the antiderivative. See the discussion of $\int \frac{1}{x} \, dx$ in Sect. 7.1.1.


1 
(@ / Page” 


3. Solve the following integrals:


(a) [t= Has, (-l<x <1) 


Xx 
b Sg 
o lz - 
(c) lsS« >1) 
(d) Siig 


14+ x? 


4. Solve the following integrals:


COS X 
(a) ; ——"_ dx 


1+ sin?x 
cos x 


——_ dx 
JV1+sin2x 


(c) / xe* dx 


(d) france, (-} <= < =) 


(e) [ sectxa, (-=<x<4). 
2 2 


(b) 


5. Solve the following integrals:

(a) $\int \sin^2 x \, dx$

Hint. The standard trick for this and for $\int \cos^2 x \, dx$ is to use the duplication
formulas $\cos 2x = 2\cos^2 x - 1 = 1 - 2\sin^2 x$, as in example B of the
preceding section.


(b) cos’x sin’x dx 


(c) : cos*x sin?x dx 
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(d) ; sintx dx 


(e) $\int \cosh^2 x \, dx$

Hint. The same trick as in item (a) but using the duplication formula for
hyperbolic cosine (Sect. 7.2 Exercise 19). This also works for $\int \sinh^2 x \, dx$.

(f) $\int \cos ax \cos bx \, dx$ where a and b are real numbers.

Hint. Use the addition formulas for the circular functions. Variants in which
one or both factors are replaced by the sine are treated similarly.


6. Solve the following integrals:


(a) [ve + 1dx 
(b) [ve ase os 


Hint. The integrals (a) and (b) occur often. A trigonometric or hyperbolic
substitution will work, but one can also integrate by parts. The similar integral
$\int \sqrt{1 - x^2} \, dx$ was worked out in the text.

2 


dx, (x>1). 


laa 


7. The integral $\int \sec x \, dx$ arises frequently in the course of solving other integrals.
Solve it by writing

\[ \sec x = \frac{\sec^2 x + \sec x \tan x}{\sec x + \tan x} \]

and consulting the derivatives of the circular functions listed in Sect. 7.1.
The most convenient domain is $]-\pi/2, \pi/2[$ since sec x + tan x is positive there.


8. Solve the following integrals:


(a) [sect as, (- > <x< =) 


[vi¥eas 
(c) feFar. (x > 0) 


(d) fora +x’) dx 


x—1 
(e) \V« (x > 1) 
x+1 


x-1 d 1 
() IV: ae, (> 1) 
(g) [i+ van. (x > 0). 


9. Show that the function $F(x) = \int_0^x \cos(1/t) \, dt$ is differentiable at 0 and compute
F'(0). This is so, despite the fact that the integrand is discontinuous at 0.
Hint. Integrate by parts in a cunning way.
10. Let f be continuous in an interval A and let a and b be points in A. Show that

\[ \int_a^b f(x) \, dx = \int_a^b f(a + b - x) \, dx. \]

Can you prove this without assuming continuity, given that a < b and f is
integrable on [a, b]?

11. The length of the upper arc of the ellipse $x^2/a^2 + y^2/b^2 = 1$ (with a > b > 0),
from x = 0 to x = c (where 0 < c < a), is given by the integral (asked for in
Sect. 6.7 Exercise 2):

\[ \int_0^c \sqrt{\frac{a^2 - e^2 x^2}{a^2 - x^2}} \, dx \]

where e, the eccentricity (not necessarily 2.718...), is given by

\[ e = \frac{\sqrt{a^2 - b^2}}{a}. \]

Find a substitution that converts this integral to

\[ a \int_0^{\arcsin(c/a)} \sqrt{1 - e^2 \sin^2 \theta} \, d\theta. \]

Note. The integral obtained is one of the three standard forms that Legendre gave for elliptic
integrals.

12. Let f have a continuous derivative on an interval A and let a and b be integers
in A with a < b. For each real x, let [x] be the greatest integer less than or equal
to x. Prove that

\[ \sum_{n=a}^{b} f(n) = \int_a^b f(x) \, dx + \frac{f(a) + f(b)}{2} + \int_a^b \left( x - [x] - \frac{1}{2} \right) f'(x) \, dx. \]

Hint. Begin by computing the second integral.

Note. This formula is the simplest instance of the Euler-Maclaurin summation formula, and
is the first step in proving it. This will be taken up in an exercise in Chap. 12.

13. Let f be continuous in the open interval A and possess an inverse function
$f^{-1}$. Suppose that an antiderivative F is known for f. Show how to express an
antiderivative for $f^{-1}$ in terms of the functions x, $f^{-1}$ and F.

Hint. This is easy if f has a continuous derivative, since one can write

friefir 


and integrate by parts. This produces a formula for an antiderivative of $f^{-1}$. Then
one can try to show that the same formula gives an antiderivative of $f^{-1}$ even
when f is not assumed to be differentiable. Note that f is strictly monotonic by
Sect. 4.4 Exercise 4.

Note. The result is connected with the Legendre transform, Sect. 5.10 Exercise 14.

14. Use the method of the previous exercise to solve the following integrals:

(c) $\int \arctan x \, dx$
(d) $\int \cosh^{-1} x \, dx$.


15. Let the functions f and g be continuous in an open interval A. Suppose that
f is differentiable, and that f' is continuous and either entirely non-positive or
entirely non-negative. Show that for all a and b in A there exists $\xi$ between a
and b, such that

\[ \int_a^b f g = f(a) \int_a^{\xi} g + f(b) \int_{\xi}^b g. \]

Hint. Integrate by parts and use the mean value theorem for integrals, Proposition 6.15.

Note. This result, the second mean value theorem for integrals, was proved in greater generality
in Sect. 6.8. See also Exercise 17 below.


The remaining exercises in this section are concerned with extending the rule for
integration by parts using the notion of primitive. It is a good idea to take another look
at the definition of primitive, as used in this text and introduced in Sect. 6.5, and recall
Proposition 6.19 which identified the primitives of piece-wise continuous functions.
The first, rather mild, extension allows piece-wise continuous integrands and has
some practical uses, as such integrands occur often in technology. A theoretical
application will occur in Chap. 12. The second extension goes about as far as is
possible for the Riemann-Darboux integral.

16. Let f and g be piece-wise continuous functions in an interval A. Let F be a
primitive for f and G a primitive for g.

(a) Show that FG is a primitive for Fg + fG.
(b) Deduce the rule for integration by parts: for each a and b in A we have

\[ \int_a^b F g = F(b) G(b) - F(a) G(a) - \int_a^b f G. \]
17. (©) There is a very general version of the rule for integration by parts that does
not rely on derivatives at all. Its proof is based on the approximation of integrable
functions in the mean by step functions (Sect. 6.8).

Let f and g be integrable on an interval [a, b]. Let F be a primitive of f and G
a primitive of g. Then

\[ \int_a^b F g = F(b) G(b) - F(a) G(a) - \int_a^b f G. \]

(a) Prove the formula in the case that f and g are step functions.
(b) Prove the formula in the case that f and g are integrable.
Hint. Approximate f and g in the mean by step functions.

Note. Applying the result of this exercise we get another version of the second mean value
theorem, where, in the notation of Exercise 15, f is a primitive of a positive integrable function
and g is integrable. It is still not as general as the version in Sect. 6.8, in which f is merely
monotonic.
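
The summation formula in the exercise above that introduces [x] (the first step towards
the Euler-Maclaurin formula) can be tested numerically before one attempts a proof. The
sketch below assumes the formula in the form: the sum of f(n) for n = a, ..., b equals the
integral of f over [a, b], plus (f(a) + f(b))/2, plus the integral of (x - [x] - 1/2) f'(x)
over [a, b]. The function names and the use of Simpson's rule are choices made for this
illustration only.

```python
import math


def simpson(g, a, b, n=400):
    """Composite Simpson approximation of the integral of g from a to b (n even)."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3


def summation_formula_sides(f, df, a, b):
    """Return (sum of f(n) for n = a..b, right-hand side of the assumed formula)."""
    lhs = sum(f(n) for n in range(a, b + 1))
    integral_f = simpson(f, a, b)
    # On [k, k+1[ we have [x] = k, so the sawtooth factor is x - k - 1/2;
    # integrating piece by piece avoids the jumps at the integers.
    correction = sum(
        simpson(lambda x, k=k: (x - k - 0.5) * df(x), k, k + 1)
        for k in range(a, b)
    )
    rhs = integral_f + 0.5 * (f(a) + f(b)) + correction
    return lhs, rhs


if __name__ == "__main__":
    print(summation_formula_sides(lambda x: x * x, lambda x: 2 * x, 0, 3))
    print(summation_formula_sides(lambda x: math.exp(x / 2),
                                  lambda x: 0.5 * math.exp(x / 2), 0, 5))
```

Splitting the last integral at the integers mirrors the hint: on each unit interval the
factor x - [x] - 1/2 is a polynomial and integration by parts applies directly.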


8.2 Integrating Rational Functions 


Every rational function has an antiderivative that may be expressed using elementary 
functions. The proof of this remarkable fact will occupy the bulk of this section. Some 
knowledge of algebra is assumed. 

A rational function has the form P(x)/Q(x) where P and Q are polynomials. If 
the degree of P is greater than or equal to the degree of Q we can divide Q into P and 
obtain quotient and remainder. The quotient is a polynomial and its antiderivative 
also a polynomial. The remainder is a polynomial with degree less than that of Q. 

We can therefore concentrate on the case of a rational function P (x) / Q(x) where 
the degree of P is lower than the degree of Q. We also assume that Q is a monic 
polynomial, meaning that the leading coefficient of Q is 1. 

We will need two major inputs from algebra: 


(a) The fundamental theorem of algebra for real polynomials. The polynomial Q
has a factorisation:

\[ Q(x) = (x - \lambda_1)^{r_1} \cdots (x - \lambda_m)^{r_m} (x^2 + \alpha_1 x + \beta_1)^{s_1} \cdots (x^2 + \alpha_n x + \beta_n)^{s_n} \]

where the real numbers $\lambda_1, ..., \lambda_m$ are distinct, the real number pairs $(\alpha_1, \beta_1), ...,
(\alpha_n, \beta_n)$ are distinct, the exponents $r_1, ..., r_m, s_1, ..., s_n$ are positive integers, and
the second-degree factors have no real roots (they are, in other words, irreducible
over the reals). The factors $x - \lambda_j$ and $x^2 + \alpha_j x + \beta_j$ are called the prime factors
of Q. A prime factor is said to be simple if its exponent in the factorisation is 1.
First-degree polynomials and second-degree polynomials that do not factorise over
the reals are the prime elements in the ring of real polynomials in one variable. They
play a role in polynomial theory very similar to that of the prime integers in number
theory.

(b) The rational function P(x)/Q(x) has a partial fractions decomposition; see the
next section.

Exercise Show that $\sum_{j=1}^{m} r_j + 2 \sum_{j=1}^{n} s_j = \deg Q$ (the degree of the polynomial Q).


8.2.1 Partial Fractions 


The partial fractions decomposition of P(x)/Q(x), given that the degree of Q is
higher than the degree of P, and that Q has the factorisation as outlined above, is

\[ \frac{P(x)}{Q(x)} = \sum_{j=1}^{r_1} \frac{a_{1j}}{(x - \lambda_1)^j} + \cdots + \sum_{j=1}^{r_m} \frac{a_{mj}}{(x - \lambda_m)^j}
   + \sum_{j=1}^{s_1} \frac{b_{1j} x + c_{1j}}{(x^2 + \alpha_1 x + \beta_1)^j} + \cdots + \sum_{j=1}^{s_n} \frac{b_{nj} x + c_{nj}}{(x^2 + \alpha_n x + \beta_n)^j}. \]

It is guaranteed that the coefficients $a_{ij}$ ($1 \le i \le m$, $1 \le j \le r_i$), and $b_{ij}$ and $c_{ij}$
($1 \le i \le n$, $1 \le j \le s_i$), can be found uniquely by solving simultaneous linear equations.
It is only necessary that the form of the fractions is correctly set down. A foolproof
method to find the coefficients is to clear the denominators by multiplying through
by Q(x) and then equate coefficients of like powers of x.


8.2.2 Practicalities 


We look at some practical hints for finding the partial fractions decomposition of 
P(x)/Q(x), on the assumption that the real factorisation of Q is already known. 
Remember that one can always express the problem as a system of linear equations, 
that is guaranteed a unique set of solutions when the form of the fractions is correct. 
However, some of the tricks shown below can save labour. 


Case A. The prime factors are simple and of first degree. This is the case

\[ Q(x) = (x - \lambda_1)(x - \lambda_2) \cdots (x - \lambda_m) \]

where the numbers $\lambda_1, ..., \lambda_m$ are real and distinct.
An example will illustrate a convenient method. Find a, b and c, such that

\[ \frac{x}{(x - 1)(x - 2)(x - 3)} = \frac{a}{x - 1} + \frac{b}{x - 2} + \frac{c}{x - 3}. \]

Clearing the denominators we find

\[ x = a(x - 2)(x - 3) + b(x - 1)(x - 3) + c(x - 1)(x - 2). \]

Substituting the values x = 1, 2, 3 in turn, we get $a = \frac{1}{2}$, $b = -2$, $c = \frac{3}{2}$.
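
The substitution trick of Case A is easily mechanised. In the sketch below (the function
name and interface are assumptions made for this illustration) substituting a root r_i in
the cleared equation kills every term but one, which gives a_i as P(r_i) divided by the
product of the differences r_i - r_j over the other roots. Exact rational arithmetic is
used so that no rounding intervenes.

```python
from fractions import Fraction


def simple_root_coefficients(numerator, roots):
    """Coefficients a_i in P(x)/prod(x - r_i) = sum of a_i/(x - r_i), for distinct roots.

    `numerator` evaluates P, whose degree must be lower than len(roots).
    Substituting x = r_i in the cleared equation gives
    a_i = P(r_i) / product over j != i of (r_i - r_j).
    """
    coeffs = []
    for i, r in enumerate(roots):
        denom = Fraction(1)
        for j, s in enumerate(roots):
            if j != i:
                denom *= Fraction(r) - Fraction(s)
        coeffs.append(Fraction(numerator(Fraction(r))) / denom)
    return coeffs


if __name__ == "__main__":
    # The example x / ((x - 1)(x - 2)(x - 3)) from the text.
    print(simple_root_coefficients(lambda x: x, [1, 2, 3]))
```

For the example in the text this produces the coefficients 1/2, -2 and 3/2.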


Case B. The prime factors are of first degree but are not all simple. This is the case

\[ Q(x) = (x - \lambda_1)^{r_1} \cdots (x - \lambda_m)^{r_m} \]

where some exponents are greater than 1.
Again we give an illustrative example. Find a, b, c and d, such that

\[ \frac{1}{(x - 1)^2 (x - 2)^2} = \frac{a}{x - 1} + \frac{b}{(x - 1)^2} + \frac{c}{x - 2} + \frac{d}{(x - 2)^2}. \]

Clearing denominators we find

\[ 1 = a(x - 1)(x - 2)^2 + b(x - 2)^2 + c(x - 1)^2 (x - 2) + d(x - 1)^2. \]

The substitutions x = 1 and x = 2 give b = 1 and d = 1. Next we differentiate and
obtain

\[ 0 = a((x - 2)^2 + 2(x - 1)(x - 2)) + 2(x - 2) + c(2(x - 1)(x - 2) + (x - 1)^2) + 2(x - 1). \]

Now putting x = 1 and x = 2 gives a = 2 and c = -2.
An alternative to differentiating in the second step is to write (having already
determined that b = 1 and d = 1)

\[ 1 - (x - 2)^2 - (x - 1)^2 = a(x - 1)(x - 2)^2 + c(x - 1)^2 (x - 2). \]

Now it will be found that x - 1 and x - 2 divide the left-hand side, which equals
-2(x - 1)(x - 2), and can be cancelled. A further substitution of 1, followed by 2,
then reveals a and c.


Case C. Some prime factors are irreducible quadratics but all exponents are 1. This
is the case

\[ Q(x) = (x - \lambda_1) \cdots (x - \lambda_m)(x^2 + \alpha_1 x + \beta_1) \cdots (x^2 + \alpha_n x + \beta_n). \]

Consider the example

\[ \frac{x^2}{(x - 1)(x^2 + 1)} = \frac{a}{x - 1} + \frac{bx + c}{x^2 + 1}. \]

Clearing denominators we find

\[ x^2 = a(x^2 + 1) + (bx + c)(x - 1). \]

Substituting x = 1 gives $a = \frac{1}{2}$. Now we write

\[ x^2 - \frac{1}{2}(x^2 + 1) = (bx + c)(x - 1). \]

Now x - 1 divides the left-hand side and cancelling it we find

\[ \frac{1}{2}(x + 1) = bx + c \]

giving $b = c = \frac{1}{2}$.
Alternatively, if you know about complex numbers, you can substitute one of the
complex roots of $x^2 + 1$ (they are i and -i), to find b and c.
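
Whatever trick is used to find the coefficients, the result is easy to check by evaluating
both sides at a few points. The following sketch does this for the example of Case C with
exact rational arithmetic; the function names are ad hoc. Since clearing denominators
turns the identity into an equality of polynomials of degree at most 2, exact agreement at
three or more points (other than x = 1) already settles the matter.

```python
from fractions import Fraction


def lhs(x):
    """The rational function x^2 / ((x - 1)(x^2 + 1))."""
    return x * x / ((x - 1) * (x * x + 1))


def rhs(x, a, b, c):
    """The proposed decomposition a/(x - 1) + (b x + c)/(x^2 + 1)."""
    return a / (x - 1) + (b * x + c) / (x * x + 1)


def decomposition_holds(a, b, c, samples):
    """True if both sides agree exactly at every sample point (none may equal 1)."""
    return all(lhs(x) == rhs(x, a, b, c) for x in samples)


if __name__ == "__main__":
    half = Fraction(1, 2)
    points = [Fraction(n) for n in (0, 2, 3, -1, 5)]
    print(decomposition_holds(half, half, half, points))
```

A wrong coefficient shows up at once as a failed comparison, which makes this a cheap guard
against slips in the hand calculation.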


8.2.3 Outline of Proof 


The existence of the partial fractions decomposition can be proved by linear algebra. 
We shall sketch a proof, accessible for the reader familiar with vector spaces, in 
particular the notions of basis and linear independence. The idea is that the partial 
fractions decomposition is merely a change of basis in a finite-dimensional vector 
space over the real field. The reader unfamiliar with these ideas can simply skip this 
section. 

For a given positive integer d, the set of all polynomials with degree less than d
is a vector space over the real field, and it has finite dimension d. In fact, a basis for it
is provided by the set

\[ \{1, x, x^2, ..., x^{d-1}\} \]

comprising d polynomials.

Our real interest is rational functions. Let Q be a monic polynomial with degree d.
We shall denote by $R_Q$ the space of all rational functions, expressible in the form
P(x)/Q(x) for some polynomial P with degree less than d. It is a vector space of
dimension d. As a basis for $R_Q$ we can indicate the set

\[ \left\{ \frac{1}{Q(x)}, \frac{x}{Q(x)}, \frac{x^2}{Q(x)}, ..., \frac{x^{d-1}}{Q(x)} \right\}. \]
The polynomial Q has a factorisation as described in item (a) above (by the
fundamental theorem of algebra). Deploying the constants $\lambda_1, ..., \lambda_m$, the pairs
$(\alpha_1, \beta_1), ..., (\alpha_n, \beta_n)$ and the exponents $r_1, ..., r_m, s_1, ..., s_n$, we consider
the functions in the ensuing list:

\[ \frac{1}{(x - \lambda_k)^p}, \quad (1 \le p \le r_k, \; 1 \le k \le m), \]

\[ \frac{x}{(x^2 + \alpha_k x + \beta_k)^q}, \quad (1 \le q \le s_k, \; 1 \le k \le n), \]

\[ \frac{1}{(x^2 + \alpha_k x + \beta_k)^q}, \quad (1 \le q \le s_k, \; 1 \le k \le n). \]

The functions in the list all belong to the vector space $R_Q$, as they can obviously
be expressed with Q as denominator and with a numerator of degree less than d. The
number of functions in the list (the reader is invited to count them) is the same as the
degree of Q, that is, it is the same as the dimension of $R_Q$. We can conclude that they
form another basis for $R_Q$, provided they can be shown to be linearly independent.
We omit this step, which is most easily accomplished by exploiting the roots of Q,
including the complex roots of the irreducible quadratic factors. Complex numbers
will be considered in Chap. 9.

Having shown that the functions in the list constitute a basis for $R_Q$, it follows that
every element of $R_Q$ can be expressed in a unique fashion as a linear combination of
them. This is, of course, just the partial fractions decomposition.


Exercise To get an idea of how an independence proof might proceed, the reader
can try to prove that the six functions

\[ \frac{1}{x - 1}, \quad \frac{1}{(x - 1)^2}, \quad \frac{1}{x^2 + x + 1}, \quad \frac{x}{x^2 + x + 1}, \quad \frac{1}{(x^2 + x + 1)^2}, \quad \frac{x}{(x^2 + x + 1)^2} \]

are linearly independent. This is the case $Q(x) = (x - 1)^2 (x^2 + x + 1)^2$.
Hint. One has to show that if a relation

\[ \frac{a_1}{x - 1} + \frac{a_2}{(x - 1)^2} + \frac{b_1 + b_2 x}{x^2 + x + 1} + \frac{b_3 + b_4 x}{(x^2 + x + 1)^2} = 0 \]

holds (for all x), then the coefficients $a_1, a_2, b_1, b_2, b_3$ and $b_4$ are all 0. One way to
start is to multiply by $(x - 1)^2$ and set x = 1.
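8.2.4 How to Integrate the Fractions


The problem of integrating P(x)/Q(x) reduces to integrating each term in the partial
fractions decomposition.
Two of the fractions are easily dealt with:

\[ \int \frac{1}{x - \lambda} \, dx = \ln(x - \lambda) \quad (\text{or } \ln(\lambda - x) \text{ if } x < \lambda) \]

and

\[ \int \frac{1}{(x - \lambda)^r} \, dx = \frac{1}{(1 - r)(x - \lambda)^{r-1}}, \quad (r \ne 1). \]

Next we write

\[ \frac{bx + c}{(x^2 + \alpha x + \beta)^s} = \frac{\frac{1}{2} b (2x + \alpha)}{(x^2 + \alpha x + \beta)^s} + \frac{\frac{1}{2} (2c - b\alpha)}{(x^2 + \alpha x + \beta)^s}. \]

The first fraction on the right is treated to the substitution $u = x^2 + \alpha x + \beta$, which
leads, in the case $s \ne 1$, to

\[ \int \frac{2x + \alpha}{(x^2 + \alpha x + \beta)^s} \, dx = \frac{1}{(1 - s) u^{s-1}} = \frac{1}{(1 - s)(x^2 + \alpha x + \beta)^{s-1}}, \]

and, in the case s = 1, to

\[ \int \frac{2x + \alpha}{x^2 + \alpha x + \beta} \, dx = \ln(x^2 + \alpha x + \beta). \]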


There remains the more difficult task of solving the integral

\[ \int \frac{1}{(x^2 + \alpha x + \beta)^s} \, dx. \]

First we write

\[ x^2 + \alpha x + \beta = \left( x + \frac{\alpha}{2} \right)^2 + \beta - \frac{\alpha^2}{4}. \]

This is the familiar operation of completing the square. Set $u = x + (\alpha/2)$. The
constant $\beta - (\alpha^2/4)$ is positive (because the polynomial has no real root) and we set
$\gamma^2 = \beta - (\alpha^2/4)$. Now we have the integral

\[ I_s = \int \frac{1}{(u^2 + \gamma^2)^s} \, du. \]

Using integration by parts we find

\[ I_s = \frac{u}{(u^2 + \gamma^2)^s} + 2s \int \frac{u^2}{(u^2 + \gamma^2)^{s+1}} \, du
       = \frac{u}{(u^2 + \gamma^2)^s} + 2s I_s - 2s \gamma^2 I_{s+1} \]

and collecting multiples of $I_s$ we obtain

\[ I_{s+1} = \frac{u}{2s \gamma^2 (u^2 + \gamma^2)^s} + \frac{2s - 1}{2s \gamma^2} I_s. \]

This is an example of a reduction formula. It is convenient for purposes of calculation
to knock 1 off s and write it as

\[ I_s = \frac{u}{2(s - 1) \gamma^2 (u^2 + \gamma^2)^{s-1}} + \frac{2s - 3}{2(s - 1) \gamma^2} I_{s-1}, \quad s = 2, 3, 4, ... \]

After a finite number of applications we come down to the integral

\[ I_1 = \int \frac{1}{u^2 + \gamma^2} \, du = \frac{1}{\gamma} \arctan \frac{u}{\gamma} = \frac{1}{\gamma} \arctan \frac{x + \alpha/2}{\gamma}. \]
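
The reduction formula translates directly into a recursive procedure. The sketch below is
an illustration only: it assumes the formula in the form I_s = u / (2(s-1)γ²(u²+γ²)^(s-1))
+ (2s-3) / (2(s-1)γ²) · I_(s-1), with I_1 = (1/γ) arctan(u/γ), and the names I and
derivative_gap are invented for the purpose. It evaluates the resulting antiderivative and
compares a difference quotient of it with the integrand.

```python
import math


def I(s, u, gamma):
    """Antiderivative of 1/(u^2 + gamma^2)^s from the reduction formula (s = 1, 2, ...)."""
    if s == 1:
        return math.atan(u / gamma) / gamma
    g2 = gamma * gamma
    first = u / (2 * (s - 1) * g2 * (u * u + g2) ** (s - 1))
    return first + (2 * s - 3) / (2 * (s - 1) * g2) * I(s - 1, u, gamma)


def derivative_gap(s, u, gamma, h=1e-5):
    """Gap between a central difference quotient of I_s at u and 1/(u^2 + gamma^2)^s."""
    slope = (I(s, u + h, gamma) - I(s, u - h, gamma)) / (2 * h)
    return abs(slope - 1.0 / (u * u + gamma * gamma) ** s)


if __name__ == "__main__":
    for s in range(1, 5):
        print(s, derivative_gap(s, 0.7, 1.3))
```

The recursion depth equals s, which matches the remark that a finite number of
applications brings one down to the arctangent.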
We summarise the conclusions concerning the form of the antiderivatives just 
derived. 


Proposition 8.3 The antiderivative of a rational function P(x)/Q(x) can be
expressed as the sum of three functions $G_1(x) + G_2(x) + G_3(x)$ (some of which
may be 0), where

(1) $G_1(x)$ is a rational function.

(2) $G_2(x)$ is a linear combination, with real coefficients, of logarithms of first or
second-degree polynomials.

(3) $G_3(x)$ is a linear combination, with real coefficients, of arctangents of
first-degree polynomials.

We often distinguish the different parts of the integral by calling $G_1(x)$ the rational
part and $G_2(x) + G_3(x)$ the transcendental part.


8.2.5 Integrating Rational Functions of sin θ and cos θ

We are going to solve the integral

\[ \int R(\cos\theta, \sin\theta) \, d\theta \tag{8.2} \]

where R(x, y) = f(x, y)/g(x, y) is a rational function of x and y, that is, the quotient
of two polynomials in the two variables x and y. This can always be accomplished
by the half-angle substitution $t = \tan(\theta/2)$. Its efficacy can be explained by the fact
that the circle $x^2 + y^2 = 1$ has the rational parametrisation:

\[ x = \frac{1 - t^2}{1 + t^2}, \quad y = \frac{2t}{1 + t^2}. \]

Proposition 8.4 The substitution $t = \tan(\theta/2)$ transforms the integral (8.2) into the
integral of a rational function of t.

Proof The perfectly explicit proof is the series of elementary calculations:

\[ \sin\theta = 2 \sin\frac{\theta}{2} \cos\frac{\theta}{2}
   = \frac{2 \sin\frac{\theta}{2} \cos\frac{\theta}{2}}{\cos^2\frac{\theta}{2} + \sin^2\frac{\theta}{2}} = \frac{2t}{1 + t^2}, \]

\[ \cos\theta = \cos^2\frac{\theta}{2} - \sin^2\frac{\theta}{2}
   = \frac{\cos^2\frac{\theta}{2} - \sin^2\frac{\theta}{2}}{\cos^2\frac{\theta}{2} + \sin^2\frac{\theta}{2}} = \frac{1 - t^2}{1 + t^2}, \]

\[ dt = \frac{1}{2 \cos^2\frac{\theta}{2}} \, d\theta
   = \frac{\cos^2\frac{\theta}{2} + \sin^2\frac{\theta}{2}}{2 \cos^2\frac{\theta}{2}} \, d\theta = \frac{1 + t^2}{2} \, d\theta, \]

all of which leads to

\[ \int R(\cos\theta, \sin\theta) \, d\theta = \int R\left( \frac{1 - t^2}{1 + t^2}, \frac{2t}{1 + t^2} \right) \frac{2}{1 + t^2} \, dt. \]

8.2.6 Further Useful Reduction Formulas 


We list some more reduction formulas that the reader is likely to encounter. They all 
come in useful for finding antiderivatives. Yet others are explored in the exercises. 


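(i) $\int (\ln x)^n\,dx = x(\ln x)^n - n\int (\ln x)^{n-1}\,dx$

(ii) $\int x^n e^x\,dx = x^n e^x - n\int x^{n-1}e^x\,dx$

(iii) $\int \sin^n x\,dx = -\dfrac{1}{n}\sin^{n-1}x\cos x + \dfrac{n-1}{n}\int \sin^{n-2}x\,dx$

(iv) $\int \cos^n x\,dx = \dfrac{1}{n}\cos^{n-1}x\sin x + \dfrac{n-1}{n}\int \cos^{n-2}x\,dx.$

The last two rules lead to Wallis' integrals:

$$\int_0^{\pi/2} \sin^n x\,dx = \int_0^{\pi/2} \cos^n x\,dx =
\begin{cases}
\dfrac{n-1}{n}\cdot\dfrac{n-3}{n-2}\cdots\dfrac{1}{2}\cdot\dfrac{\pi}{2} & \text{if } n \text{ is even}\\[2ex]
\dfrac{n-1}{n}\cdot\dfrac{n-3}{n-2}\cdots\dfrac{2}{3} & \text{if } n \text{ is odd}
\end{cases}$$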
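As a sanity check on Wallis' integrals, here is a small sketch (the function names and the midpoint sum used for comparison are assumptions for illustration). It evaluates the product formula exactly with rational arithmetic and compares it with a direct numerical integration of sinⁿx over [0, π/2].

```python
import math
from fractions import Fraction

def wallis(n):
    """Value of the integral of sin^n x over [0, pi/2] from the product formula."""
    factor = Fraction(1)
    k = n
    while k >= 2:                      # factors (n-1)/n, (n-3)/(n-2), ...
        factor *= Fraction(k - 1, k)
        k -= 2
    # The product ends with 1/2 * pi/2 for even n and with 2/3 for odd n.
    return float(factor) * (math.pi / 2 if n % 2 == 0 else 1.0)

def midpoint_sum(f, a, b, n=2000):
    """Plain midpoint Riemann sum, used only as an independent check."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

checks = []
for n in range(8):
    numeric = midpoint_sum(lambda x: math.sin(x) ** n, 0.0, math.pi / 2)
    checks.append((n, wallis(n), numeric))
    print(n, wallis(n), numeric)
```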


8.2.7. Exercises 


In the case of a rational integrand the precise form of the antiderivative may depend 
on which interval, whose endpoints are successive zeros of the denominator, is being 
considered. If this is the case it is simplest to assume that x is higher than all roots 
of the denominator. It is then easy to adjust the antiderivative thus obtained for other 
intervals separated by roots of the denominator. 


1. Solve the following integrals: 


(a) 1 ae 
a —— ax 
x2+1 


(b) wonea 
© | wanes” 
(d) DEFT dx 
() GWG ED dx 
( 


=e 


2. (a) Evaluate the integral

$$\int_0^1 \frac{x^4(1-x)^4}{x^2+1}\,dx.$$

(b) Using the estimate $\frac{1}{2} < (x^2+1)^{-1} < 1$ obtain the inequalities

$$\frac{22}{7} - \frac{1}{630} < \pi < \frac{22}{7} - \frac{1}{1260}.$$

Note. The approximation 22/7 to π was known to Archimedes. The result shows that 1979/630,
or 3.14126..., is an approximation to π with error less than 1/1000. The actual error is around
3/10000.
3. Solve the integral

$$\int \frac{1}{a + \sin\theta}\,d\theta$$

where a is a non-zero constant. Distinguish carefully between the cases $a^2 < 1$,
$a^2 = 1$ and $a^2 > 1$.


4. Prove the reduction formulas listed in this section.
5. Verify the formulas known as Wallis' integrals.
6. Obtain a reduction formula for $\int \sec^n x\,dx$. This is useful because integrals
involving powers of secant often arise from using the substitution x = tan t in
integrals involving the factor $\sqrt{1 + x^2}$.

7. To exploit the result of the previous exercise one needs the useful integral
$\int \sec x\,dx$. This was considered in Sect. 8.1 Exercise 7. Solve it again by using the
half-angle substitution t = tan(x/2). Reconcile the result with that of Sect. 8.1
Exercise 7.

8. Using the previous two exercises and the substitution x = tan t solve the integral

$$\int (1 + x^2)^{3/2}\,dx.$$


. Obtain a reduction formula for rf x"e% dx, The results should show that this 


integral can be expressed by elementary functions when n is odd, including the 
case of negative n. 

10. Show that the integral $\int \sin^m x \cos^n x\,dx$, where m and n are non-negative
integers, reduces to one of the following easily solvable integrals:

$$\int f(\sin x)\cos x\,dx, \qquad \int f(\cos x)\sin x\,dx, \qquad \int f(\sin x)\,dx$$

where f is a polynomial.
11. Obtain reduction formulas for the integrals

(a) $\int \tan^n x\,dx$
(b) $\int \tan^n x \sec x\,dx$
x 
0 [ame 
12. Some integrals involving fractional powers can be reduced to the integral of a
rational function by a suitable substitution, and the resulting integral solved by
the methods of this chapter. Try to do this (or at least the reduction part) for the
following integrals:


1 
© faa 
(b) [vans dx 


i x d 
() hia 7 
1 


(d) 


dx. 
(x — a)3/2 + (x + a)3/? . 


13. Solve the following integrals:
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1 

(a) i ol dx 
1 

(c) [me 


Hint. Try to find the irreducible quadratic factors using inspired guesswork. 
Complex numbers can help in factorising the denominators; see the next chapter. 


8.3 (©) Ostrogradski’s Method 


In order to find the partial fractions decomposition of a rational function P/Q, prior 
to integrating it, one must know the roots of the denominator. This may be a hard 
task. In this nugget we shall explore what can be discovered about the integral $\int P/Q$
without finding the roots of the denominator. In particular it turns out that the rational 
part of the integral, see Proposition 8.3, may be found by linear algebra alone. This is 
variously known as Ostrogradski’s method or Hermite’s method (the former seems 
to have priority but the latter is better known). Some knowledge of algebra will be 
assumed. 

We recall the fundamental theorem of algebra, according to which a real non- 
constant polynomial Q has a factorisation into real prime factors 


$$Q = g_1^{k_1} g_2^{k_2} \cdots g_\ell^{k_\ell}.$$


The prime factors $g_j$, all distinct, are first-degree polynomials, or second-degree
polynomials without real roots. The exponents are positive integers. Recall that a
prime factor $g_j$ of Q is called simple if $k_j = 1$.

The polynomial Q is called square-free if all the exponents $k_j$ equal 1. The usage
is the same as for the factorisation of integers into primes. Square-free indicates that 
it is not divisible by any perfect square (in which context constant polynomials do 
not count). 

Prime polynomials (for our purposes this means first-degree polynomials or 
second-degree polynomials without real roots) have the following important prop- 
erty: if P and Q are polynomials and a prime polynomial g divides PQ, then g 
divides either P or Q. 

The role that the following, rather mysterious, lemma plays will be apparent 
shortly. The reader may prefer to look ahead and see how the lemma is used before 
reading its proof. 


Lemma 8.1 Suppose that $P_1$, $P_2$, $Q_1$ and $Q_2$ are polynomials that satisfy the conditions:

(i) $\left(\dfrac{P_1}{Q_1}\right)' + \dfrac{P_2}{Q_2} = 0$.

(ii) All prime factors common to both $Q_1$ and $Q_2$ are simple factors of $Q_2$.

Then $Q_1$ divides $P_1$ and $Q_2$ divides $P_2$.


Proof Our strategy for proving this is to show that all prime factors of the denominators
$Q_1$ and $Q_2$ can be cancelled against the numerators $P_1$ and $P_2$ using a process
of descent (which is really the same as induction). There are two possible cases in
each step of the descent.


(A) The polynomial g is a prime factor of $Q_2$ but not of $Q_1$.


In this case we show that g divides $P_2$. By condition (i) we have

$$Q_1 Q_2 P_1' - Q_1' Q_2 P_1 + P_2 Q_1^2 = 0. \tag{8.3}$$

Since g divides $Q_2$ it must divide $P_2 Q_1^2$ also. But g does not divide $Q_1$, and therefore
g divides $P_2$.


(B) The polynomial g is a prime factor of $Q_1$.


In this case we show that g divides $P_1$. To see this we first write $Q_1 = g^r\psi_1$ and
$Q_2 = g^s\psi_2$, where $\psi_1$ and $\psi_2$ are polynomials not divisible by g, the exponent r
is a positive integer, and, according to assumption (ii), the exponent s is 0 or 1.
Substituting into (8.3) and cancelling powers of g we find

$$g\psi_1\psi_2 P_1' - r g'\psi_1\psi_2 P_1 - g\psi_1'\psi_2 P_1 + g^{r-s+1}\psi_1^2 P_2 = 0.$$

We see that $r \ge 1$ and $r - s + 1 \ge 1$, so that g divides $g'\psi_1\psi_2 P_1$. But g does not
divide $g'$, nor by assumption does it divide $\psi_1$ or $\psi_2$. It follows that g divides $P_1$.
In case A we replace $P_2$ by $P_2/g$ and $Q_2$ by $Q_2/g$. In case B we replace $P_1$ by
$P_1/g$ and $Q_1$ by $Q_1/g$. We proceed to eliminate all prime factors of $Q_1$ and $Q_2$, at
each step applying case A or B as appropriate. A simple way to implement this is to
apply case B until all prime factors of $Q_1$ have been stripped, then to apply case A to
do the same to $Q_2$. It is clear that this process eventually clears both denominators
of all their prime factors, and leads to the conclusion that $Q_1$ divides $P_1$ and $Q_2$
divides $P_2$.


Suppose as before that Q has the prime factorisation

$$Q = g_1^{k_1} g_2^{k_2} \cdots g_\ell^{k_\ell}. \tag{8.4}$$

We define the polynomials

$$Q_1 = g_1^{k_1-1} g_2^{k_2-1} \cdots g_\ell^{k_\ell-1}, \qquad Q_2 = g_1 g_2 \cdots g_\ell. \tag{8.5}$$


Exercise Show that the polynomial $Q_1$ is the highest common factor of Q and $Q'$,
and that $Q_2 = Q/Q_1$. Show also that $Q_1$ and $Q_2$ satisfy condition (ii) of Lemma 8.1.


We will also need the formula

$$\frac{Q_1'}{Q_1} = \sum_{j=1}^{\ell} (k_j - 1)\frac{g_j'}{g_j}. \tag{8.6}$$


Exercise Prove this formula. 


We will need the following input from algebra. The highest common factor of 
two polynomials can be computed without factorising them, using the Euclidean 
algorithm (this was also referred to in the nugget “Multiplicity”). The reader unfa- 
miliar with this should consult a book of algebra. The significance here is that the 
polynomials $Q_1$ and $Q_2$ can be found without factorising Q, since $Q_1$ is the highest
common factor of Q and $Q'$.
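The computation of $Q_1$ and $Q_2$ without factorising Q can be sketched as follows. This is an illustrative implementation, not the text's: polynomials are represented as lists of exact rational coefficients with the constant term first, and the helper names `trim`, `derivative`, `divmod_poly` and `gcd_poly` are assumptions made for this example.

```python
from fractions import Fraction

# A polynomial is a list of coefficients, constant term first:
# [1, 2, 1] stands for 1 + 2x + x^2.

def trim(p):
    """Convert to Fractions and drop trailing zeros (zero polynomial -> [])."""
    p = [Fraction(c) for c in p]
    while p and p[-1] == 0:
        p.pop()
    return p

def derivative(p):
    return trim([k * c for k, c in enumerate(p)][1:])

def divmod_poly(a, b):
    """Quotient and remainder of a divided by the non-zero polynomial b."""
    a, b = trim(a), trim(b)
    q = [Fraction(0)] * max(len(a) - len(b) + 1, 0)
    r = a
    while len(r) >= len(b):
        shift = len(r) - len(b)
        c = r[-1] / b[-1]              # cancels the leading term of r
        q[shift] = c
        r = trim([rc - c * b[i - shift] if i >= shift else rc
                  for i, rc in enumerate(r)])
    return trim(q), r

def gcd_poly(a, b):
    """Monic highest common factor, by the Euclidean algorithm."""
    a, b = trim(a), trim(b)
    while b:
        a, b = b, divmod_poly(a, b)[1]
    return [c / a[-1] for c in a]

# Example: Q = (x + 1)^2 (x^2 + 1) = x^4 + 2x^3 + 2x^2 + 2x + 1.
Q = [1, 2, 2, 2, 1]
Q1 = gcd_poly(Q, derivative(Q))   # expected x + 1
Q2, rem = divmod_poly(Q, Q1)      # expected x^3 + x^2 + x + 1, remainder 0
print(Q1, Q2, rem)
```

Exact rational arithmetic is used because the Euclidean algorithm in floating point can misjudge whether a remainder is zero.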

The following decomposition of P/Q is the basis of Ostrogradski’s method. 


Lemma 8.2 Suppose that deg P < deg Q and let $Q_1$ and $Q_2$ be defined by (8.4) and
(8.5). Then there exist unique polynomials $P_1$ and $P_2$, such that deg $P_1$ < deg $Q_1$ and
deg $P_2$ < deg $Q_2$, satisfying

$$\frac{P}{Q} = \left(\frac{P_1}{Q_1}\right)' + \frac{P_2}{Q_2}. \tag{8.7}$$

Moreover $P_1$ and $P_2$ can be found by linear algebra, more precisely by the method
of undetermined coefficients.


Outline of Proof We first show that if deg $P_1$ < deg $Q_1$ and deg $P_2$ < deg $Q_2$, and
if

$$\left(\frac{P_1}{Q_1}\right)' + \frac{P_2}{Q_2} = 0,$$

then $P_1 = P_2 = 0$. This follows from Lemma 8.1, according to which we can conclude
that $Q_1$ divides $P_1$ and $Q_2$ divides $P_2$. Since deg $P_1$ < deg $Q_1$ and deg $P_2$ < deg $Q_2$
it follows that $P_1$ and $P_2$ are both 0.

The rest of the argument is linear algebra. We define two vector spaces. Firstly we
need the vector space $R_Q$ of all rational functions expressible in the form P(x)/Q(x)
with deg P < deg Q (also used in the discussion of partial fractions in Sect. 8.2). Secondly
we need the vector space W comprising all pairs of polynomials $(P_1, P_2)$, such
that deg $P_1$ < deg $Q_1$ and deg $P_2$ < deg $Q_2$. The first space has dimension deg Q and
the second deg $Q_1$ + deg $Q_2$, but these are equal since $Q = Q_1Q_2$.

We introduce a linear mapping $T : W \to R_Q$ defined by

$$T(P_1, P_2) = \left(\frac{P_1}{Q_1}\right)' + \frac{P_2}{Q_2}.$$

That the right-hand side really belongs to the vector space $R_Q$ is easily seen by
expanding the derivative and using (8.6). The reader should check this point.


The argument at the beginning of the proof and based on the lemma tells us that
the kernel of T contains only the zero vector in W. Hence T is injective. Since
the domain and codomain of T are vector spaces of the same dimension, it follows
by linear algebra that it is also surjective, and hence bijective. This establishes the
required decomposition.


As a corollary of the decomposition (8.7) we obtain

$$\int \frac{P}{Q} = \frac{P_1}{Q_1} + \int \frac{P_2}{Q_2}.$$

The integral on the right-hand side is a transcendental function since the denominator
$Q_2$ is square-free and the numerator has lower degree than the denominator. The first
term on the right-hand side $P_1/Q_1$ is the rational part of the integral.
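To see the method of undetermined coefficients at work, here is a short worked example. The integrand $1/(x^2+1)^2$ is chosen for illustration and is not taken from the text. In this case $Q = (x^2+1)^2$ and $Q_1 = Q_2 = x^2+1$, so $P_1 = ax + b$ and $P_2 = cx + d$ with unknown coefficients a, b, c, d.

```latex
% Ansatz from (8.7) with Q_1 = Q_2 = x^2 + 1:
\[
\frac{1}{(x^2+1)^2} = \left(\frac{ax+b}{x^2+1}\right)' + \frac{cx+d}{x^2+1}
\]
% Put both terms over the common denominator (x^2+1)^2:
\[
\left(\frac{ax+b}{x^2+1}\right)' = \frac{-ax^2 - 2bx + a}{(x^2+1)^2},
\qquad
\frac{cx+d}{x^2+1} = \frac{cx^3 + dx^2 + cx + d}{(x^2+1)^2}
\]
% Compare coefficients of x^3, x^2, x and 1 in the numerators:
\[
c = 0, \quad d - a = 0, \quad c - 2b = 0, \quad a + d = 1
\;\Longrightarrow\;
a = d = \tfrac{1}{2}, \quad b = c = 0
\]
% Rational part plus transcendental part:
\[
\int \frac{dx}{(x^2+1)^2} = \frac{x}{2(x^2+1)} + \frac{1}{2}\arctan x
\]
```

The four linear equations have a unique solution, as Lemma 8.2 predicts, and no root of the denominator was needed to find the rational part.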


8.3.1 Exercises 


1. Let $\psi$ be a square-free polynomial of degree d and let P have degree less than 2d
and no prime factor in common with $\psi$. Show that the integral $\int P/\psi^2$ is rational
if and only if $\psi$ divides $P'\psi' - P\psi''$.

2. Show that the integral $\displaystyle\int \frac{4x^3 - 3x^2 - 2}{(x^3 + 1)^2}\,dx$ is rational and evaluate it.


1 
3. Solve the integral : Ga? dx. 


8.3.2 Pointers to Further Study 


— Symbolic integration 


8.4 (©) Numerical Integration 


If an antiderivative for f is not forthcoming, it is often the case that to calculate the
integral $\int_a^b f$ one has to approximate it.

We have seen how Riemann sums furnish an approximation, but one that yields
few extra decimal digits with each improvement step. More accurate methods exist
that build on the same idea. Points $t_k$ are chosen in the interval [a, b] and the sum
$\sum a_k f(t_k)$ is formed with coefficients $a_k$ that satisfy $\sum a_k = b - a$. There are a
number of different prescriptions that have been developed. They are called numerical
integration rules. By choosing the points and the coefficients appropriately it is
possible to reach some remarkably accurate approximations.


8.4.1 Trapezium Rule¹

The interval [a, b] is divided into n equal parts. Let

$$h = \frac{b-a}{n}, \quad t_0 = a, \quad t_k = a + kh, \quad k = 1, 2, \ldots, n-1, \quad t_n = b.$$


The trapezium approximation is

$$T_n = \frac{h}{2}\left(f(t_0) + 2\sum_{k=1}^{n-1} f(t_k) + f(t_n)\right). \tag{8.8}$$

It is possible to bound the error. Assume that f is twice differentiable and that
$|f''(x)| \le M$ in [a, b]. Then

$$\left|\int_a^b f - T_n\right| \le \frac{M(b-a)^3}{12n^2}.$$

A convenient approach is to double n in each improvement step, as then previous
points can be used again. We can roughly say that each improvement step yields on
average at least 0.6 (being near to $\log_{10} 4$) extra correct digits after the decimal point,
as opposed to 0.3 digits for Riemann sums.
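The doubling strategy can be tried out with a few lines of code. This is a minimal sketch under stated assumptions: the test integrand sin on [0, π] (exact integral 2, with |f''| ≤ 1, so M = 1) and the function name `trapezium` are choices made here for illustration.

```python
import math

def trapezium(f, a, b, n):
    """Trapezium approximation T_n with n equal subintervals."""
    h = (b - a) / n
    inner = sum(f(a + k * h) for k in range(1, n))
    return 0.5 * h * (f(a) + 2.0 * inner + f(b))

a, b, M = 0.0, math.pi, 1.0
results = []
n = 4
for _ in range(5):
    error = abs(2.0 - trapezium(math.sin, a, b, n))
    bound = M * (b - a) ** 3 / (12 * n ** 2)   # error bound M(b-a)^3 / (12 n^2)
    results.append((n, error, bound))
    print(n, error, bound)
    n *= 2                                      # one improvement step
```

Each doubling of n should reduce the error by a factor close to 4, which is the source of the figure of roughly 0.6 decimal digits per step. For simplicity this sketch recomputes all function values at each step rather than reusing the previous points.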


8.4.2 Midpoint Rule 


The interval [a, b] is divided into n equal parts. Let

$$h = \frac{b-a}{n}, \quad t_0 = a, \quad t_k = a + kh, \quad k = 1, 2, \ldots, n-1, \quad t_n = b.$$

The midpoint rule is the approximation

$$M_n = h\sum_{k=0}^{n-1} f\left(\frac{t_k + t_{k+1}}{2}\right). \tag{8.9}$$

This is just a Riemann sum where in each subinterval we choose to evaluate f at the
midpoint.
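A direct transcription of the midpoint rule might look as follows. The function name and the test integrand (the exponential function on [0, 1], with exact integral e − 1) are assumptions for illustration.

```python
import math

def midpoint(f, a, b, n):
    """Midpoint approximation M_n with n equal subintervals."""
    h = (b - a) / n
    # (t_k + t_{k+1}) / 2 = a + (k + 1/2) h
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

approx = midpoint(math.exp, 0.0, 1.0, 100)
print(approx, math.e - 1)
```

For a convex integrand such as this one the midpoint value lies below the true integral.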


¹ Also known as the trapezoidal rule.




8.4.3 Simpson’s Rule 


The interval is partitioned into 2n equal subintervals. Set

$$h = \frac{b-a}{2n}, \quad t_0 = a, \quad t_k = a + kh, \quad k = 1, 2, \ldots, 2n-1, \quad t_{2n} = b.$$

Simpson's rule is the approximation

$$S_n = \frac{h}{3}\left(f(t_0) + 4\sum_{k=1}^{n} f(t_{2k-1}) + 2\sum_{k=1}^{n-1} f(t_{2k}) + f(t_{2n})\right). \tag{8.10}$$

The error is much improved over that for the trapezium rule. Assume that f is
four times differentiable and that $|f^{(4)}(x)| \le M$ in the interval [a, b]. Then

$$\left|\int_a^b f - S_n\right| \le \frac{M(b-a)^5}{2880n^4}.$$

Each doubling of n contributes, at a rough estimate, a further 1.2 (near to $\log_{10} 16$)
correct decimal digits. Note also that the rule is exact for third-degree polynomials,
as is clear from the error estimate.

The rule is easy to remember in the form:

$$\int_a^b f \approx \frac{h}{3}\,[\,\text{initial} + \text{twice even} + \text{four times odd} + \text{final}\,].$$
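The following sketch implements Simpson's rule as "initial + four times odd + twice even + final" and checks two of the claims above: exactness for a cubic and the error bound. The function name and both test integrands are illustrative assumptions.

```python
import math

def simpson(f, a, b, n):
    """Simpson approximation using 2n equal subintervals."""
    h = (b - a) / (2 * n)
    odd = sum(f(a + (2 * k - 1) * h) for k in range(1, n + 1))   # t_1, t_3, ...
    even = sum(f(a + 2 * k * h) for k in range(1, n))            # t_2, t_4, ...
    return h / 3.0 * (f(a) + 4.0 * odd + 2.0 * even + f(b))

# Exactness for a cubic: the integral of x^3 - 2x over [0, 2] is 4 - 4 = 0.
cubic = simpson(lambda x: x ** 3 - 2 * x, 0.0, 2.0, 1)

# Error against the bound M(b-a)^5 / (2880 n^4) for sin on [0, pi], where M = 1.
n = 4
error = abs(2.0 - simpson(math.sin, 0.0, math.pi, n))
bound = math.pi ** 5 / (2880 * n ** 4)
print(cubic, error, bound)
```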


8.4.4 Proof of the Error Estimate 


We will prove the error estimate for Simpson’s rule in some detail. Error estimates 
for the trapezium rule and the midpoint rule are simpler and are left to the reader as 
exercises. 

The interval is partitioned into an even number of intervals so we first estimate
the error for an interval partitioned into two subintervals. It is convenient to take the
interval as $[-c, c]$, with partition points

$$t_0 = -c, \quad t_1 = 0, \quad t_2 = c,$$

so that h = c. Simpson's approximation is

$$\frac{c}{3}\bigl(f(-c) + 4f(0) + f(c)\bigr).$$

We set

$$E(t) = \int_{-t}^{t} f - \frac{t}{3}\bigl(f(-t) + 4f(0) + f(t)\bigr), \qquad 0 \le t \le c.$$

Now we differentiate repeatedly with respect to t. The reader should check the algebra
in the following steps:

$$\begin{aligned}
E'(t) &= \tfrac{2}{3}f(t) + \tfrac{2}{3}f(-t) - \tfrac{4}{3}f(0) + \tfrac{t}{3}f'(-t) - \tfrac{t}{3}f'(t),\\
E''(t) &= \tfrac{1}{3}f'(t) - \tfrac{1}{3}f'(-t) - \tfrac{t}{3}f''(-t) - \tfrac{t}{3}f''(t),\\
E'''(t) &= \tfrac{t}{3}f'''(-t) - \tfrac{t}{3}f'''(t).
\end{aligned}$$

Recall now that we are assuming that $|f^{(4)}(x)| \le M$ for all x. By the mean value
theorem we therefore have

$$|f'''(s) - f'''(-s)| \le 2Ms$$

for s > 0, so that

$$-\frac{2M}{3}s^2 \le E'''(s) \le \frac{2M}{3}s^2, \qquad (s > 0).$$

Integrate these inequalities repeatedly from 0 to t; the correct values for the middle
term at t = 0 can be read from the derivatives calculated above. After the third
integration we find

$$-\frac{M}{90}t^5 \le E(t) \le \frac{M}{90}t^5,$$

which gives

$$|E(c)| \le \frac{M}{90}c^5.$$

We apply this to the partition of [a, b] into 2n intervals of length h. We have
h = (b − a)/2n and the error for each pair of consecutive subintervals is bounded
by

$$\frac{M}{90}h^5 = \frac{M(b-a)^5}{(90 \times 2^5)n^5} = \frac{M(b-a)^5}{2880n^5}.$$

Adding together over the n pairs of intervals we get the error bound

$$\frac{M(b-a)^5}{2880n^4}$$

for Simpson's rule.

8.4.5 Exercises


1. Show that Simpson’s approximation for the two interval partition is the integral 
from a to b of the second-degree polynomial whose graph (a parabola) passes 
through the three points 


$$(a, f(a)), \quad \left(\frac{a+b}{2}, f\left(\frac{a+b}{2}\right)\right), \quad (b, f(b)).$$


This makes it obvious that Simpson's approximation is exact for a parabola, but
does not reveal why it should be exact for a cubic.
Hint. It's easy to do it for the case $a = -b$.

2. Prove the error estimate for the trapezium rule. 

Hint. Do it first for a partition with only one subinterval. You can take the
interval as $[-c, c]$ so the trapezium approximation is $c(f(-c) + f(c))$ and h =
2c. Estimate the error by copying the method used above for Simpson's rule; only
it's much easier.

3. Develop an error estimate for the midpoint rule. Compare it with the error in the 
trapezium rule. Which is more accurate? 

4. There is another way to write the error that sometimes gives more information, 
for example it can tell you whether the approximation lies above or below the 
true value. The treatment suggested here, and in the next two exercises, is closely 
related to Taylor’s theorem (to be studied in Chap. 11). 

For the trapezium approximation show that if f is twice differentiable, the error
(the integral minus the approximation) is

$$\int_a^b f - T_n = -\frac{(b-a)^3}{12n^2} f''(\xi)$$

for some $\xi$ in ]a, b[. This gives the intuitive result that for a convex function the
integral is below the approximation.
Hint. Do it first for one interval, letting the interval be $[-c, c]$ and h = 2c. Let

$$E(x) = \int_{-x}^{x} f - x\bigl(f(-x) + f(x)\bigr),$$

so that the error is E(c). Apply Cauchy's form of the mean value theorem repeatedly
to the quotient $E(x)/x^3$. You will need the fact that derivatives have the
intermediate value property (see Sect. 5.6 Exercise 8).

5. Develop a similar result for the midpoint rule and compare it with the trapezium 
rule as regards magnitude and sign. 

6. A similar result is also available for Simpson's rule, but it is more complicated.
If f is four times differentiable then

$$\int_a^b f - \text{Simpson's approx.} = -\frac{(b-a)^5}{2880n^4} f^{(4)}(\xi)$$

for some $\xi$ in ]a, b[. A proof is outlined in the following steps:


(a) Obtain the result in the case of a Simpson approximation with two subintervals.
Hint. You can take the interval $[-c, c]$ subdivided at 0, so that h = c. Let

$$E(x) = \int_{-x}^{x} f - \frac{x}{3}\bigl(f(-x) + 4f(0) + f(x)\bigr).$$

Apply Cauchy's form of the mean value theorem repeatedly to the quotient
$E(x)/x^5$. You will again need the fact that derivatives satisfy the intermediate
value property.

(b) Obtain the result for 2n subintervals. Again you will need the intermediate 
value property. 


7. Let $I_n$ denote the approximation to $\int_a^b f$ obtained from applying Simpson's rule
with 2n subintervals. Define a new approximation

$$J_n = \frac{16 I_{2n} - I_n}{15}.$$

Find a formula for the error $\int_a^b f - J_n$. The result should indicate that $J_n$ is a
slight improvement over Simpson's rule.


8.4.6 Pointers to Further Study 


— Numerical integration 
— Gauss quadrature 
— Numerical analysis 


Chapter 9
Complex Numbers


The shortest path between two truths in the real domain passes 
through the complex domain 


J. Hadamard 


Negative real numbers have no square root. This apparently regrettable fact is alle- 
viated by extending the real number field R to the complex number field C. In it all 
numbers have a square root. 


9.1 The Complex Number Field 


Formally the complex number field is a field C, that is, a set of elements satisfying
axioms A1-A6 of Chap. 1, that includes the real numbers, contains an element i,
necessarily non-real, that satisfies $i^2 = -1$, and contains no other non-real elements
except those that it has to in virtue of being a field. Its non-real elements must include
all the elements of the form a + bi, where a and b are real numbers, but plainly we do
not have to include especially elements such as $1 + 2i + 3i^2 + 4i^3 + 5i^4$, involving
second and higher powers of i, since they can be expressed in the form a + bi by
using the property $i^2 = -1$, and therefore are already there.

Interestingly we do not have to include especially the reciprocals of elements
(although fields must contain reciprocals of all non-zero elements). They are expressible
in the form a + bi and are already included. The reciprocal of the complex
number z = a + bi, assuming that $z \ne 0$, is given by

$$z^{-1} = \frac{a}{a^2+b^2} - \frac{b}{a^2+b^2}\,i,$$

as the reader should check.
The important identity 


$$(a + bi)(a - bi) = a^2 + b^2$$


is used very frequently. As a simple instance suppose that a + bi = 0 (where a and
b are real numbers). We deduce

$$0 = (a + bi)(a - bi) = a^2 + b^2,$$

so that a = b = 0. This implies that the expression of a given complex number in
the form a + bi, with real a and b, is unique. This allows us to define the functions

$$\operatorname{Re} z = x, \quad \operatorname{Im} z = y, \qquad (z = x + yi,\ x, y \text{ real})$$


called the real part of z and the imaginary part of z. 

We often give ourselves a simple geometric picture of the complex numbers. Just 
as we think of the real numbers as a line, we think of the complex numbers as forming 
a plane, by identifying the number z = x + yi with the point (x, y) of the coordinate 
plane. In this way the real numbers are identified with the x-axis, called the real axis, 
and the complex numbers of the form yi with the y-axis, called the imaginary axis. 
We sometimes identify z with the vector joining (0, 0) to (x, y), instead of simply 
with the point (x, y). These identifications sometimes give a way of proving things 
in Euclidean geometry using algebraic operations on complex numbers. 

Extending a field by joining a new element to serve as a root of some equation is
one of the most elementary operations of field theory. For example we can start with
the field Q of rational numbers. We saw that the equation $x^2 - 2 = 0$ has no root in
Q since there can be no rational square root of 2. Let us join a root, denoted by $\sqrt{2}$,
to Q. We obtain a new field, denoted by $Q(\sqrt{2})$, consisting of all elements of the
form $a + b\sqrt{2}$ with rational a and b. Again we can simplify an expression such as
$1 + 2\sqrt{2} + 3(\sqrt{2})^2$ to the form $a + b\sqrt{2}$. And the reciprocal of $a + b\sqrt{2}$ is given
by

$$(a + b\sqrt{2})^{-1} = \frac{a}{a^2 - 2b^2} - \frac{b\sqrt{2}}{a^2 - 2b^2}.$$

Note that the denominators cannot be 0, because the square root of 2 is not rational.

This procedure gives us another way to enlarge the field of rational numbers to 
include the square root of two, quite different from that of Chap. 1. And it seems 
neither more nor less acceptable than conjuring up an element i to serve as a square 
root of −1. But there is a big problem. The field $Q(\sqrt{2})$ takes us only part of the way
from Q to R. There is much still missing; for example there is no square root of 3.
We can join it, and then one by one, all square roots that are not so far expressible,
then all cube roots and so on. After that we can join roots of polynomials that are
still stubbornly unfactorisable. We obtain a whole sequence of fields intermediate
between Q and R. However we can never reach R by this piecemeal means; there
are some irrationals that are not even expressible as a root of a polynomial equation
with rational coefficients. These are the transcendental numbers. They include e and
π for example.

So we have to arrive at R by a big leap, accomplished by means of the axiom
of completeness. From R to C is but a small further step, joining a non-real root of
$x^2 + 1 = 0$. But the remarkable thing is that the process ends here. Every polynomial
equation with coefficients in C has a root in C; no further extensions are needed.

The fact just mentioned is called the fundamental theorem of algebra and it was 
mentioned in the previous chapter in connection with the partial fractions decompo- 
sition of a rational function, though without mentioning complex numbers. The fact 
stated there is a simple corollary of the fundamental theorem of algebra. 


9.1.1 Square Roots 


We will take a little detour to examine the claim made before that every complex 
number has a square root. As a corollary we solve any quadratic equation. We will 
use purely algebraic arguments. Later we will examine radicals (cube roots, fourth 
roots, etc.) by different methods. 

Suppose we are given the complex number a + bi and we wish to find its square 
root x + yi. We assume that x, y, a and b are real and in future will often not make 
such assumptions explicit. If x + yi is a square root of a + bi then 


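$$(x + yi)^2 = x^2 - y^2 + 2xyi = a + bi,$$

and so

$$x^2 - y^2 = a, \qquad 2xy = b.$$

Using the identity $(x^2 + y^2)^2 = (x^2 - y^2)^2 + 4x^2y^2$ we find that

$$x^2 + y^2 = \sqrt{a^2 + b^2}.$$

Solving for $x^2$ we find

$$x^2 = \frac{a + \sqrt{a^2 + b^2}}{2}$$

and since $a + \sqrt{a^2 + b^2} \ge 0$ we obtain

$$x = \pm\sqrt{\frac{a + \sqrt{a^2 + b^2}}{2}}.$$

For each of the two values of x we find y by the relation y = b/2x, except that when
a = b = 0 we take x = y = 0. (The relation is also unavailable when b = 0 and a < 0,
since then x = 0; in that case the square roots are $\pm\sqrt{-a}\,i$.)
We thus get two square roots (except for when a = b = 0) given by

$$\sqrt{\frac{a + \sqrt{a^2 + b^2}}{2}} + \frac{b}{2}\left(\frac{a + \sqrt{a^2 + b^2}}{2}\right)^{-1/2} i$$

and its negative.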

It will be rightly objected that we have not proved the existence of any square 
root, merely shown that if a square root exists it must be given by this formula. The 
reader is therefore invited to check that this is indeed the square root of a + bi by 
squaring it. 
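The check by squaring can also be carried out numerically. The sketch below is illustrative only: the function name `complex_sqrt` is an assumption, and the branch for x = 0 handles negative real numbers, where the relation y = b/2x cannot be used.

```python
import math

def complex_sqrt(a, b):
    """One square root x + yi of a + bi, returned as the pair (x, y)."""
    if a == 0 and b == 0:
        return 0.0, 0.0
    # x = sqrt((a + sqrt(a^2 + b^2)) / 2)
    x = math.sqrt((a + math.hypot(a, b)) / 2.0)
    if x == 0.0:
        # b = 0 and a < 0: the square roots are +-sqrt(-a) i.
        return 0.0, math.sqrt(-a)
    return x, b / (2.0 * x)

x1, y1 = complex_sqrt(3.0, 4.0)     # (2 + i)^2 = 3 + 4i
x2, y2 = complex_sqrt(-3.0, 4.0)    # (1 + 2i)^2 = -3 + 4i
x3, y3 = complex_sqrt(-4.0, 0.0)    # (2i)^2 = -4
print((x1, y1), (x2, y2), (x3, y3))
```

The other square root in each case is the negative of the one returned.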

Now we have the square root we can solve any quadratic equation, using the usual 
formula. 


Proposition 9.1 The quadratic equation $ax^2 + bx + c = 0$, with coefficients a, b, c
in C and with $a \ne 0$, has a root in C. Its roots are found by the usual formula. More
precisely, let $d = b^2 - 4ac$ and let $\omega$ be a square root of d. Then the roots are

$$\frac{1}{2a}(-b \pm \omega).$$


The reader is invited to check that these are the roots. They are the only ones; this
can be seen by the usual method of completing the square, based on observing that
the equation $ax^2 + bx + c = 0$ is equivalent to

$$\left(x + \frac{b}{2a}\right)^2 = \frac{1}{4a^2}(b^2 - 4ac).$$


We can also appeal to a simple fact of field theory, that a polynomial with degree n 
cannot have more than n roots. 


9.1.2 Modulus and Conjugate 


Let z be a complex number and let z = x + yi, where x and y are real numbers. The
number

$$\bar{z} = x - yi$$

is called the complex conjugate of z, or just conjugate for short. The real number

$$|z| = \sqrt{x^2 + y^2}$$

is called the modulus of z, but is often called its absolute value. It has a geometric
interpretation. If you think of z as the coordinate vector (x, y) (if you like, the vector
joining (0, 0) to the point (x, y)), then |z| is its Euclidean length.
The real part Re z = x and the imaginary part Im z = y have already been mentioned.
Here is a list of important properties of these operations.

(1) $\overline{\bar{z}} = z$    (2) $\overline{z + w} = \bar{z} + \bar{w}$
(3) $\operatorname{Re} z = \frac{1}{2}(z + \bar{z})$    (4) $\operatorname{Im} z = \frac{1}{2i}(z - \bar{z})$
(5) $\overline{zw} = \bar{z}\,\bar{w}$    (6) $\overline{z^{-1}} = (\bar{z})^{-1}$
(7) $|z|^2 = z\bar{z}$    (8) $|zw| = |z|\,|w|$
(9) $\operatorname{Re} z \le |z|$    (10) $\operatorname{Im} z \le |z|$.


Some indications of the proofs. Rules 1-6 are obvious consequences of the definitions.
Rule 7 is the identity $x^2 + y^2 = (x + yi)(x - yi)$. Rule 8 follows from rule
7 using the calculation

$$|zw|^2 = zw\,\overline{zw} = z\bar{z}\,w\bar{w} = |z|^2|w|^2.$$

Rule 9 (and similarly rule 10) is the inequality $x \le \sqrt{x^2 + y^2}$.
The following result (more precisely the first inequality) is called the triangle 
inequality and is immensely important. 


Proposition 9.2 For all complex z and w we have

(1) $|z + w| \le |z| + |w|$ (triangle inequality)
(2) $\bigl|\,|z| - |w|\,\bigr| \le |z - w|$.


Proof We calculate, applying at least seven of the rules listed above:

$$\begin{aligned}
|z + w|^2 &= (z + w)(\bar{z} + \bar{w})\\
&= z\bar{z} + z\bar{w} + w\bar{z} + w\bar{w} = |z|^2 + 2\operatorname{Re}(z\bar{w}) + |w|^2\\
&\le |z|^2 + 2|z\bar{w}| + |w|^2 = |z|^2 + 2|z||w| + |w|^2\\
&= (|z| + |w|)^2.
\end{aligned}$$

This proves the inequality in item 1.
Now z = z − w + w, so by item 1 we find $|z| \le |z - w| + |w|$, or

$$|z| - |w| \le |z - w|.$$

Interchange z and w. This makes no difference to the right-hand side. We find

$$|w| - |z| \le |z - w|.$$

In one of the displayed inequalities the left-hand side is $\bigl|\,|z| - |w|\,\bigr|$.


We see from the calculation proving the triangle inequality that equality holds
in it if and only if $\operatorname{Re}(z\bar{w}) = |z\bar{w}|$, which is equivalent to saying that
$\operatorname{Im}(z\bar{w}) = 0$ and $\operatorname{Re}(z\bar{w}) \ge 0$. If z = a + bi and w = c + di then

$$\operatorname{Im}(z\bar{w}) = bc - ad = \begin{vmatrix} b & a \\ d & c \end{vmatrix}$$

so the equality $\operatorname{Im}(z\bar{w}) = 0$ says that the vectors (a, b) and (c, d) are linearly
dependent. Putting it differently, equality holds in the triangle inequality
if and only if either one or both of z and w are 0, or there is a positive real number
$\lambda$, such that $z = \lambda w$.

There is a more geometric interpretation of $\operatorname{Im}(z\bar{w})$ that derives from its determinantal
form. Its absolute value is the area of the parallelogram with vertices 0, z, w
and z + w, or twice the area of the triangle with vertices 0, z and w. Even $\operatorname{Re}(z\bar{w})$
has a geometric interpretation. It is the scalar product of the vectors z and w.


9.1.3 Exercises 


1. Check the claim that the reciprocal of a + bi is $(a - bi)/(a^2 + b^2)$.

2. Check that $\operatorname{Re}(z\bar{w})$ is the scalar product of the vectors z and w, where the complex
numbers are identified with plane vectors.

3. Check that the formula

$$\pm\left(\sqrt{\frac{a + \sqrt{a^2 + b^2}}{2}} + \frac{b}{2}\left(\frac{a + \sqrt{a^2 + b^2}}{2}\right)^{-1/2} i\right)$$

really does provide a square root of a + bi.
4. Show that the area of the triangle with vertices $z_1$, $z_2$ and $z_3$ is the absolute value
of

$$\frac{1}{2}\operatorname{Im}(z_1\bar{z}_2 + z_2\bar{z}_3 + z_3\bar{z}_1).$$

5. Prove the inequality

$$|z| \le \sqrt{2}\,\max(|\operatorname{Re} z|, |\operatorname{Im} z|).$$

Show that $\sqrt{2}$ cannot be replaced by a smaller number.
6. Let

$$f(x) = a_n x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0$$

be a polynomial with real coefficients. Show that for all complex z we have

$$\overline{f(z)} = f(\bar{z}).$$

Deduce that if w is a complex root of f(z) = 0 so is $\bar{w}$.

9.2 Algebra in the Complex Plane


The identification of C with the plane produces a valuable geometric picture of the 
algebra of the complex numbers. We can, for example, introduce polar coordinates 
in the plane, 


$$x = r\cos\theta, \quad y = r\sin\theta, \qquad (r \ge 0,\ -\infty < \theta < \infty)$$

(we allow here the case r = 0 and do not restrict θ to any particular interval) and
then write

$$x + yi = r(\cos\theta + i\sin\theta).$$

Here we have r = |x + yi|, but θ (called the argument of z) is not uniquely determined
by z, as we may always add to it an integer multiple of 2π. The point 0 is an
exception, as it does not have an argument.

If we agree to choose θ in the interval ]−π, π] then it is uniquely determined by
z (if z ≠ 0). It is then called the principal argument and is denoted by Arg z (always
with upper case “A” as befits its privileged status). The principal argument of z is the 
polar angular coordinate according to the most common convention. 

The following proposition describes complex multiplication in terms of polar 
coordinates. 


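Proposition 9.3. Let $z_1 = r_1(\cos\theta_1 + i\sin\theta_1)$ and $z_2 = r_2(\cos\theta_2 + i\sin\theta_2)$. Then

$$z_1 z_2 = r_1 r_2\bigl(\cos(\theta_1 + \theta_2) + i\sin(\theta_1 + \theta_2)\bigr).$$

Proof We multiply and use the addition formulas for sine and cosine:

$$\begin{aligned}
&r_1(\cos\theta_1 + i\sin\theta_1)\, r_2(\cos\theta_2 + i\sin\theta_2)\\
&\quad = r_1 r_2\bigl((\cos\theta_1\cos\theta_2 - \sin\theta_1\sin\theta_2) + i(\sin\theta_1\cos\theta_2 + \cos\theta_1\sin\theta_2)\bigr)\\
&\quad = r_1 r_2\bigl(\cos(\theta_1 + \theta_2) + i\sin(\theta_1 + \theta_2)\bigr).
\end{aligned}$$

A special case is the rule known as de Moivre's theorem,

$$(\cos\theta + i\sin\theta)^n = \cos n\theta + i\sin n\theta, \qquad (n \in \mathbb{N}).$$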


9.2.1 nth Root of a Complex Number


Using de Moivre's theorem we can show that every complex number has an nth root.
It is noteworthy that to express it we use transcendental functions, albeit elementary
ones, seemingly moving outside the realm of algebra.

Fig. 9.1 The 5th roots of 1


Let $w = r(\cos\theta + i\sin\theta)$ where r = |w|. Then one nth root of w is

$$\alpha = r^{1/n}\left(\cos\left(\frac{\theta}{n}\right) + i\sin\left(\frac{\theta}{n}\right)\right).$$

Others are

$$r^{1/n}\left(\cos\left(\frac{\theta + 2\pi k}{n}\right) + i\sin\left(\frac{\theta + 2\pi k}{n}\right)\right) = \alpha\eta^k, \qquad k = 1, \ldots, n-1,$$

where $\alpha$ was the root given first and

$$\eta = \cos\left(\frac{2\pi}{n}\right) + i\sin\left(\frac{2\pi}{n}\right).$$

These nth roots, a total of n of them, are all distinct if $w \ne 0$.

Actually these are all the nth roots of w, as they are roots of the polynomial
equation $x^n - w = 0$, and it is known from algebra that a polynomial equation of
degree n cannot have more than n roots.

The numbers $\eta^k$, k = 0, 1, ..., n − 1 are the nth roots of 1, and may be visualised
geometrically as the vertices of a regular polygon with n sides inscribed in the unit
circle as illustrated in Fig. 9.1.
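The construction of the nth roots as $\alpha\eta^k$ can be illustrated with Python's standard `cmath` module. This is a sketch: the function name `nth_roots` is an assumption, while `cmath.polar` and `cmath.rect` are standard library functions that convert between a complex number and its modulus and principal argument.

```python
import cmath
import math

def nth_roots(w, n):
    """All n-th roots of the complex number w, computed as alpha * eta**k."""
    r, theta = cmath.polar(w)                      # modulus and principal argument
    alpha = cmath.rect(r ** (1.0 / n), theta / n)  # the root given first
    eta = cmath.rect(1.0, 2.0 * math.pi / n)       # cos(2 pi/n) + i sin(2 pi/n)
    return [alpha * eta ** k for k in range(n)]

roots_of_unity = nth_roots(1, 5)       # vertices of a regular pentagon
roots_of_w = nth_roots(-8 + 0j, 3)     # cube roots of -8, one of which is -2
print(roots_of_unity)
print(roots_of_w)
```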


9.2.2 Logarithm of a Complex Number 


Properly, functions of a complex variable need a text of their own. The reason for 
introducing logarithm here, perhaps prematurely, is the light that it throws on the 
integration of rational functions. 
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Let z be a complex number, not of the form x + Oi with x < 0. We define 
Log z = In|z| + iArg z. 
The use of upper case “L” is conventional here, the function being sometimes called 
the principal logarithm. 
Now we allow the differentiation of functions of a real variable with values in the 


complex plane. This is accomplished in the most obvious way. A function f : A > C 
(where A is an interval of real numbers) has the form 


f@) =u(t) +iv(t) 
where u and v are real-valued functions with domain A. We define 
f(t) =u'(t) +iv'() 


for all t at which u and v are differentiable. 
As an example we can consider the parametrisation of the unit circle 


$$f(t) = \cos t + i\sin t.$$


Differentiation gives
$$f'(t) = -\sin t + i\cos t,$$


and, identifying a complex number with a plane vector, we can interpret this as saying 
that the velocity vector is normal to the radius and has length (speed) 1. 

Now let $a \neq 0$ and consider the function $\operatorname{Log}(x + ia)$ of the real variable $x$. We
have


$$\frac{d}{dx}\operatorname{Log}(x + ia) = \frac{1}{x + ia}, \qquad (-\infty < x < \infty). \qquad (9.1)$$
Note that there is no need to restrict x to positive values only. 
We also have, for a positive integer exponent n, 


$$\frac{d}{dx}\,\frac{1}{(x + ia)^n} = -\frac{n}{(x + ia)^{n+1}}, \qquad (-\infty < x < \infty). \qquad (9.2)$$


The proofs of these formulas are left to the exercises. They are important because 
they provide antiderivatives for functions that arise naturally when the fundamental 
theorem of algebra is applied, in its more usual complex form, to the decomposition 
of the rational function P/Q into partial fractions. According to the fundamental 
theorem the polynomial Q (with leading coefficient 1) has a factorisation 


$$Q(x) = (x - a_1)^{m_1} \cdots (x - a_k)^{m_k}$$
with complex numbers $a_j$. The fraction $P/Q$ decomposes into
$$\text{polynomial} + \sum_{j=1}^{k}\sum_{l=1}^{m_j} \frac{c_{jl}}{(x - a_j)^l}.$$


The transcendental part of the integral can then be expressed as a sum of logarithms 
of the complex first-degree polynomials $x - a_j$. Of course this implies a connection
between arctangent and the principal logarithm. It is left to the reader to show that


$$\arctan x = \frac{1}{2i}\bigl(\operatorname{Log}(x - i) - \operatorname{Log}(x + i)\bigr) + \frac{\pi}{2}. \qquad (9.3)$$
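
The connection between arctangent and the principal logarithm can be checked numerically before it is proved. The sketch below is an illustration only; it assumes that Python's `cmath.log` returns the principal logarithm $\ln|z| + i\operatorname{Arg} z$, and the helper name `arctan_via_log` is a hypothetical choice.

```python
import cmath
import math

def arctan_via_log(x):
    """Right-hand side of (9.3) for real x, using the principal logarithm."""
    # cmath.log returns the principal value ln|z| + i Arg z.
    return (cmath.log(complex(x, -1)) - cmath.log(complex(x, 1))) / 2j + math.pi / 2

for x in (-10.0, -1.0, 0.0, 0.5, 3.0):
    # The two printed values should agree up to rounding, with negligible imaginary part.
    print(x, arctan_via_log(x), math.atan(x))
```

A numerical agreement of this kind is of course no substitute for the proof asked for in the exercises.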


Complex numbers can also simplify the determination of the coefficients in the 
partial fractions decomposition. Consider the case of a polynomial Q(x) that has 
only simple roots, including all its complex roots. Then 


$$Q(x) = (x - a_1)(x - a_2) \cdots (x - a_n)$$
with $n$ distinct first-degree prime factors, where the numbers $a_j$ may be complex.


Still thinking of x as a real variable, we can differentiate. Leibniz’s rule applies (see 
the exercises) so we find 


$$Q'(x) = F_1(x) + F_2(x) + \cdots + F_n(x)$$


where $F_j(x)$ is the polynomial obtained by omitting the factor $x - a_j$ from $Q(x)$.
Since $F_k(a_j) = 0$ if $k \neq j$ we have


$$F_j(a_j) = Q'(a_j).$$


Now we have a simple formula for the partial fractions decomposition: 


$$\frac{1}{Q(x)} = \sum_{j=1}^{n} \frac{1}{Q'(a_j)}\,\frac{1}{x - a_j}. \qquad (9.4)$$


More generally, if $P(x)$ is another polynomial with degree lower than that of $Q(x)$,
we have


$$\frac{P(x)}{Q(x)} = \sum_{j=1}^{n} \frac{P(a_j)}{Q'(a_j)}\,\frac{1}{x - a_j}. \qquad (9.5)$$


A theory of differentiation with respect to a complex variable is not needed for
this simple case. We create the polynomial $Q'(x)$ by differentiating $Q(x)$ in the
normal way and then substituting $a_j$ for $x$. If $Q$ has multiple complex roots a similar
formulation is possible using higher derivatives of $Q$. However, to treat this more
general case properly it is best to use the theory of complex analytic functions. 
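
For simple roots, formula (9.5) translates directly into a short computation. The following Python sketch is an illustration under the assumption that the roots of $Q$ are already known and distinct; the function names and the example $P(x) = x + 3$, $Q(x) = (x - 1)(x^2 + 4)$ are hypothetical choices, not taken from the text.

```python
def poly_eval(coeffs, x):
    """Evaluate a polynomial by Horner's rule; coeffs run from the highest degree down."""
    result = 0
    for c in coeffs:
        result = result * x + c
    return result

def q_prime_at_root(roots, j):
    """Q'(a_j) = F_j(a_j): the product of (a_j - a_k) over all k != j."""
    value = 1
    for k, a_k in enumerate(roots):
        if k != j:
            value *= roots[j] - a_k
    return value

def partial_fraction_coefficients(p_coeffs, roots):
    """Coefficients P(a_j)/Q'(a_j) of 1/(x - a_j) in formula (9.5)."""
    return [poly_eval(p_coeffs, a) / q_prime_at_root(roots, j)
            for j, a in enumerate(roots)]

# Example: P(x) = x + 3 and Q(x) = (x - 1)(x^2 + 4), with roots 1, 2i, -2i.
roots = [1, 2j, -2j]
coeffs = partial_fraction_coefficients([1, 3], roots)

def decomposition(x):
    """Sum of the partial fractions, to be compared with P(x)/Q(x)."""
    return sum(c / (x - a) for c, a in zip(coeffs, roots))
```

Evaluating `decomposition(x)` at a few real values of `x` and comparing with $(x + 3)/((x - 1)(x^2 + 4))$ gives a quick sanity check of the coefficients.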

As we have seen, the antiderivative of a rational function can be expressed without 
using arctangents, if we use instead the principal logarithm of complex first-degree 
polynomials. Such formulations are often thrown up by computer algebra programs 
that know how to find antiderivatives of elementary functions and understand complex
numbers. Thus, given the roots of the polynomial $Q$ in the complex plane, and
assuming that they are simple, we have


$$\int \frac{P(x)}{Q(x)}\,dx = \sum_{j=1}^{n} \frac{P(a_j)}{Q'(a_j)}\,\operatorname{Log}(x - a_j). \qquad (9.6)$$


To ensure the validity of this formula we do not want $x - a_j$ to be a non-positive
real number. This is ensured if $x > a_j$ for all $j$ such that $a_j$ is real. If $a_j$ is real and
$x < a_j$ we simply replace $x - a_j$ by $a_j - x$ in the corresponding term.


9.2.3 Exercises 


1. Prove the formula 


$$\operatorname{Arg} z = 2\arctan\left(\frac{\operatorname{Im} z}{|z| + \operatorname{Re} z}\right),$$


given that $z$ is not of the form $x + 0i$ with $x \le 0$.
Hint. A proof by geometry is easiest. 

2. The real factorisation of $x^4 + 1$ into irreducible quadratic factors is needed to
calculate the integral $\int 1/(x^4 + 1)\,dx$. Obtain the factorisation using complex
numbers by noting that the roots of $x^4 + 1 = 0$, the four complex numbers


$$w_1 = \frac{1+i}{\sqrt2}, \quad w_2 = \frac{-1+i}{\sqrt2}, \quad w_3 = \frac{-1-i}{\sqrt2}, \quad w_4 = \frac{1-i}{\sqrt2},$$


form the corners of a square, and that the quadratics, $(x - w_1)(x - w_4)$ and
$(x - w_2)(x - w_3)$, have real coefficients.

3. Let $\eta = \cos(2\pi/5) + i\sin(2\pi/5)$. The four non-real fifth roots of 1 are $\eta$, $\eta^2$,
$\eta^3$ and $\eta^4$.


(a) Show that they are the roots of the polynomial
$$x^4 + x^3 + x^2 + x + 1.$$
(b) Express the sum $\eta + \eta^2 + \eta^3 + \eta^4$ in terms of $\lambda := \cos(2\pi/5)$.
(c) Deduce that $\cos(2\pi/5) = \dfrac{\sqrt5 - 1}{4}$.
(d) Show that
$$x^4 + x^3 + x^2 + x + 1 = (x^2 - 2\lambda x + 1)\bigl(x^2 - 2(2\lambda^2 - 1)x + 1\bigr).$$
Note. Item (c) shows that the number $\lambda$ is constructible with straight edge and compass, and
therefore the regular pentagon is also constructible. One such construction is described in
Euclid (book 4, prop. 11).


4. In the previous exercise an expression was obtained for $\cos(2\pi/5)$ as an algebraic
number involving a square root. We now express angles in degrees, defining
$x^\circ = x\pi/180$. The formulas $\sin 30^\circ = \tfrac12$, $\sin 45^\circ = 1/\sqrt2$ are doubtless familiar.
We also have $\cos 72^\circ = (\sqrt5 - 1)/4$ by the previous exercise. Formulas such as
these are sometimes said to give “exact” values of the circular functions. They are 
characterised by including only arithmetic operations on rationals, and radicals, 
possibly nested. 

Find exact values of the following circular functions of the given angles: 


(a) sin 36° 
(b) cos 36° 
(c) sin 6° 
(d) cos 6° 
(e) sin 3° 
(f) cos 3° 


Conclude that the sine and cosine of any multiple of 3° are expressible exactly 
with square roots. 


5. Derive the rules for differentiation of product and reciprocal


$$\frac{d}{dx}(fg) = f'g + fg', \qquad \frac{d}{dx}\left(\frac{1}{f}\right) = -\frac{f'}{f^2}$$
for complex-valued functions $f$ and $g$ of a real variable $x$.
Hint. It should be obvious just by looking at the real and imaginary parts that 
1/f is differentiable if f is. Knowing this means that the rule for reciprocal can 
be obtained with little effort from the rule for product. 


6. Derive formulas (9.1) and (9.2).
7. Derive formula (9.3).


Hint. Differentiate the formula. 


8. Express the antiderivative


1 
ai 


using the principal logarithm of first-degree complex polynomials. 


9. Let $w = a + ib$ and let $f(x) = e^{ax}(\cos bx + i\sin bx)$ for real $x$. Show that


$$f'(x) = w f(x).$$


Note. We have not yet defined exponentials of complex numbers. When we study the unification
of exponential and circular functions in Chap. 11, we will define $e^{(a+ib)x}$ and show that it
equals $e^{ax}(\cos bx + i\sin bx)$.

10. We study Cardano’s solution of the cubic equation.


(a) Show that $-b - c$ is a root of $x^3 + px + q = 0$ if $b$ and $c$ satisfy


$$bc = -\frac{p}{3}, \qquad b^3 + c^3 = q. \qquad (9.7)$$


Hint. One way is to use the identity
$$a^3 + b^3 + c^3 - 3abc = (a + b + c)(a^2 + b^2 + c^2 - ab - bc - ac).$$


(b) Show that the solutions to (9.7) are given by


$$b = \sqrt[3]{\frac{q}{2} + \frac12\sqrt{q^2 + \frac{4p^3}{27}}}, \qquad c = -\frac{p}{3b} \qquad (9.8)$$


for any choice of the two possible square roots and the three possible cube
roots.

(c) Show that all solutions to $x^3 + px + q = 0$ are obtained by fixing the square
root in (9.8) and using all three cube roots.

(d) Show that the substitution $y = x + a/3$ reduces the general cubic equation
$x^3 + ax^2 + bx + c = 0$ to an equation (for $y$) of the form $y^3 + py + q = 0$,
that can then be solved by the preceding method.


11. In this problem we assume that the coefficients $p$ and $q$ in the cubic equation
are real. In the formula for the solution in the previous exercise the quantity
$D := q^2 + 4p^3/27$ plays a crucial role. Note that $-27D$ is what in algebra is
called the discriminant. If $D$ is positive then all three roots can be found by
taking the square root of a positive number followed by the cube root of a real
number. If $D$ is negative it looks as if we are forced to find the cube root of a
complex, non-real number.


(a) Show that $D = 0$ if and only if the graph $y = x^3 + px + q$ is tangent to the
$x$-axis.

(b) Show, by examining the graph for example, that D > 0 if and only if there 
is one real root and two complex, non-real roots. 


The third case, D < 0, is the case when there are three distinct real roots. This 
was known as the casus irreducibilis. It can be shown that in this case there is 
in general no way to express the roots using only real radicals (that is, square 
roots, cube roots etc. of real numbers). 

12. We study the trigonometric solution of the cubic equation $x^3 + px + q = 0$
with real coefficients $p$ and $q$. In the case $D < 0$ (see the previous exercise),
there are three real roots and they cannot be found using Cardano’s method
without taking the cube root of a non-real number. The roots can be found
more easily by exploiting the triplication rule for the cosine function,
$\cos 3\theta = 4\cos^3\theta - 3\cos\theta$. In this exercise we assume that $D < 0$. Note that this implies
that $p < 0$.


(a) Let $x = \lambda\cos\theta$. Show that $x$ satisfies the cubic equation if


$$\lambda = \sqrt{-4p/3} \quad\text{and}\quad \cos 3\theta = -4q/\lambda^3.$$


Check that $|4q/\lambda^3| < 1$.
(b) Taking any $\theta$ that satisfies the second equation, show that the three real
solutions of the cubic equation are


$$\lambda\cos\theta, \qquad \lambda\cos\left(\theta + \frac{2\pi}{3}\right), \qquad \lambda\cos\left(\theta + \frac{4\pi}{3}\right).$$


13. In this series of exercises we look at the Chebyshev polynomials.


(a) Show that for all natural numbers $n$ there exist polynomials $T_n(x)$ and $U_n(x)$,
each of degree $n$, that satisfy


$$T_n(\cos\theta) = \cos n\theta, \qquad U_n(\cos\theta)\sin\theta = \sin(n+1)\theta$$


for all $\theta$, and find explicit expressions for them.
Hint. Use de Moivre’s theorem.

(b) Show that in the interval $[-1, 1]$ both $T_n$ and $U_n$ have $n$ distinct roots.

(c) Show that the polynomial $T_n$ satisfies $|T_n(x)| \le 1$ in the interval $[-1, 1]$,
and that it attains the value 1 at $n$ points in $[-1, 1]$.

(d) You may not think the explicit formulas (see item (a)) very appealing as ways
to calculate $T_n$ and $U_n$. However, it is easy to compute them recursively. Show
that both sequences of polynomials satisfy the same recurrence relation


$$T_{n+2}(x) = 2xT_{n+1}(x) - T_n(x), \qquad U_{n+2}(x) = 2xU_{n+1}(x) - U_n(x)$$


though with different initial conditions. 


Chapter 10
Complex Sequences and Series


I mean the word proof not in the sense of the lawyers, who set 
two half proofs equal to a whole one, but in the sense of a 
mathematician, where half proof = 0, and it is demanded for 
proof that every doubt becomes impossible. 


C. F. Gauss 


The main task of this chapter is to extend the theory of real sequences and real series 
to complex sequences and complex series. In Chap. 3 we dealt predominantly with 
positive series and stopped short of saying anything useful about real series that were 
not positive. This shortcoming will be amended here. 


10.1 The Limit of a Complex Sequence 


Let $(z_n)_{n=1}^\infty$ be a sequence of complex numbers and let $w$ be a complex number.


Definition The number $w$ is said to be the limit of the sequence $(z_n)_{n=1}^\infty$ if the
following condition is satisfied:


For every $\varepsilon > 0$, there exists a natural number $N$, such that $|z_n - w| < \varepsilon$ for
all $n \ge N$.


Formally this definition is the same as that for convergence of real sequences; 
only that the absolute value is reinterpreted as the modulus of a complex number. 

A complex sequence $(z_n)_{n=1}^\infty$ is said to be convergent if there exists some $w$, such
that $\lim_{n\to\infty} z_n = w$. If there is no such $w$ then the sequence is said to be divergent.
We will not make any use of infinite limits in the complex realm.

Let $z_n = a_n + b_n i$ and $w = s + ti$ where $a_n$, $b_n$, $s$ and $t$ are real numbers. That is,
$a_n = \operatorname{Re} z_n$, $b_n = \operatorname{Im} z_n$, $s = \operatorname{Re} w$ and $t = \operatorname{Im} w$. By the rules for complex numbers
listed in Sect. 9.1 we have


$$|a_n - s| \le |z_n - w| \quad\text{and}\quad |b_n - t| \le |z_n - w|,$$
so that if $\lim_{n\to\infty} z_n = w$ then


$$\lim_{n\to\infty} a_n = s \quad\text{and}\quad \lim_{n\to\infty} b_n = t.$$


But the converse of this last statement is also true because
$$|z_n - w|^2 = |a_n - s|^2 + |b_n - t|^2.$$


We have proved the following. 


Proposition 10.1 A complex sequence $(z_n)_{n=1}^\infty$ satisfies $\lim_{n\to\infty} z_n = w$ if and only
if $\lim_{n\to\infty} \operatorname{Re} z_n = \operatorname{Re} w$ and $\lim_{n\to\infty} \operatorname{Im} z_n = \operatorname{Im} w$.


This also shows that if a complex sequence has a limit, then the limit is unique
(because that is known for real sequences). Another important conclusion is as follows.


Proposition 10.2 Let $\lim_{n\to\infty} z_n = w$. Then $\lim_{n\to\infty} |z_n| = |w|$.


Proof It follows from the inequality $\bigl||z| - |w|\bigr| \le |z - w|$ stated in Proposition 9.2.


Cauchy’s principle of convergence for real sequences (Proposition 3.12) extends 
almost without change to complex sequences. 


Proposition 10.3 (Cauchy’s convergence principle) A complex sequence $(z_n)_{n=1}^\infty$ is
convergent if and only if it satisfies Cauchy’s condition: for each $\varepsilon > 0$ there exists
$N$, such that $|z_n - z_m| < \varepsilon$ for all $n$ and $m$ that satisfy $n \ge N$ and $m \ge N$.


Proof Let $a_n = \operatorname{Re} z_n$ and $b_n = \operatorname{Im} z_n$. Since $(z_n)_{n=1}^\infty$ is convergent if and only if
$(a_n)_{n=1}^\infty$ and $(b_n)_{n=1}^\infty$ are both convergent real sequences, it suffices to show that
$(z_n)_{n=1}^\infty$ satisfies Cauchy’s condition for complex sequences if and only if $(a_n)_{n=1}^\infty$
and $(b_n)_{n=1}^\infty$ satisfy Cauchy’s condition for real sequences. But that is obvious in
virtue of the inequalities (all of which appeared in Sect. 9.1):


$$|a_n - a_m| \le |z_n - z_m|, \qquad |b_n - b_m| \le |z_n - z_m|,$$


$$|z_n - z_m| \le \sqrt2\,\max(|a_n - a_m|, |b_n - b_m|).$$


10.2 Complex Series


A series of complex numbers $\sum_{k=1}^\infty z_k$ is said to be convergent, and its sum is the
complex number $w$, when the sequence $(s_n)_{n=1}^\infty$, given by $s_n = \sum_{k=1}^n z_k$, is convergent
and $\lim_{n\to\infty} s_n = w$.


Proposition 10.4 A complex series $\sum_{k=1}^\infty z_k$ is convergent if and only if it satisfies
the following condition: for each $\varepsilon > 0$ there exists $N$, such that for all $m$ and $n$ that
satisfy $N \le m \le n$ we have


$$\left|\sum_{k=m}^n z_k\right| < \varepsilon.$$


Proof Since $s_n - s_{m-1} = \sum_{k=m}^n z_k$, the condition of the proposition is equivalent to
Cauchy’s condition (Proposition 10.3) applied to the sequence $(s_n)_{n=1}^\infty$.


The condition of the proposition may also be written as follows: for each $\varepsilon > 0$
there exists $N$, such that for all $m$ that satisfy $m \ge N$, and for every natural number
$p$, we have $\left|\sum_{k=m}^{m+p} z_k\right| < \varepsilon$. This throws into relief the point that there is no upper
limit placed on the separation of $m$ and $n$.


10.2.1 Absolutely Convergent Series 


Proposition 10.5 Let $\sum_{k=1}^\infty z_k$ be a complex series and assume that the positive
series $\sum_{k=1}^\infty |z_k|$ is convergent. Then the series $\sum_{k=1}^\infty z_k$ is convergent and we have


$$\left|\sum_{k=1}^\infty z_k\right| \le \sum_{k=1}^\infty |z_k|.$$


Proof Let $\varepsilon > 0$. Choose $N$, such that for all $m$ and $n$ that satisfy $N \le m \le n$ we
have $\sum_{k=m}^n |z_k| < \varepsilon$. Now if $N \le m \le n$ we have


$$\left|\sum_{k=m}^n z_k\right| \le \sum_{k=m}^n |z_k| < \varepsilon$$


and so, by Proposition 10.4, the series $\sum_{k=1}^\infty z_k$ is convergent.
Moreover, we have
$$\left|\sum_{k=1}^n z_k\right| \le \sum_{k=1}^n |z_k| \le \sum_{k=1}^\infty |z_k|,$$


and therefore going to the limit and applying Proposition 10.2, we obtain


$$\left|\sum_{k=1}^\infty z_k\right| \le \sum_{k=1}^\infty |z_k|.$$


A series $\sum_{k=1}^\infty z_k$, which is such that the positive series $\sum_{k=1}^\infty |z_k|$ is convergent,
is said to be absolutely convergent. As we have just seen, it is then convergent. The 
inequality in Proposition 10.5 may be viewed as an infinite version of the triangle 
inequality. 


10.2.2 Cauchy’s Root Test 


Convergence tests for positive series can be applied to the series of moduli of a 
complex series; they are therefore also tests for absolute convergence of complex 
series. The most important of these tests were covered in Chap. 3. We include another 
here, Cauchy’s test, also known as the root test. 


Proposition 10.6 Let $\sum_{n=1}^\infty a_n$ be a positive series and suppose that the limit
$\lim_{n\to\infty} a_n^{1/n}$ exists and equals $t$. The following then hold:


(1) If $t < 1$ the series is convergent.
(2) If $t > 1$ the series is divergent.


There is no conclusion if $t = 1$.


Proof If $t < 1$ we choose $s$, such that $t < s < 1$, and choose $N$, such that $a_n^{1/n} < s$
for all $n \ge N$. Then we have $a_n < s^n$ for all $n \ge N$, and the series is convergent
by comparison with the geometric series $\sum s^n$. If $t > 1$ then $a_n^{1/n} > 1$ when $n$ is
sufficiently high, and then $a_n > 1$ and cannot tend to 0. The series $\sum_{n=1}^\infty a_n$ then
diverges.

The lack of a conclusion if $t = 1$ is illustrated by the series $\sum_{n=1}^\infty n^{-1}$ and $\sum_{n=1}^\infty n^{-2}$.


We showed in Proposition 3.25 (in the nugget on limits inferior and superior), that
if $\lim_{n\to\infty} a_{n+1}/a_n = t$ (with all denominators strictly positive), then we also have
$\lim_{n\to\infty} a_n^{1/n} = t$. Hence, if a conclusion can be obtained from the ratio test, it can
also be obtained from Cauchy’s test. However, there are cases when Cauchy’s test 
works, but the ratio test gives no conclusion. 
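
A case of this kind can be explored numerically. The sketch below uses the hypothetical example $a_n = 2^{-n + (-1)^n}$, which is not taken from the text: the ratios $a_{n+1}/a_n$ oscillate between $2$ and $1/8$, so they have no limit, while $a_n^{1/n}$ tends to $1/2$. The quantities are computed from the exponents, an implementation choice made here to avoid floating-point underflow for large $n$.

```python
def exponent(n):
    """a_n = 2**exponent(n), with exponent(n) = -n + (-1)**n."""
    return -n + (-1) ** n

def ratio(n):
    """a_{n+1} / a_n, computed from the exponents to avoid underflow."""
    return 2.0 ** (exponent(n + 1) - exponent(n))

def nth_root(n):
    """a_n ** (1/n), again computed from the exponent."""
    return 2.0 ** (exponent(n) / n)

ratios = [ratio(n) for n in range(1, 9)]              # alternates between 2 and 1/8
roots = [nth_root(n) for n in (10, 100, 1000, 10000)]  # approaches 1/2
```

For this example the root test gives convergence with $t = 1/2$, whereas the limit required by the ratio test does not exist.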




10.2.3 Extended Forms of the Ratio and Cauchy’s Tests


In this section the series $\sum_{n=1}^\infty a_n$ is a positive series. The results are therefore applicable
to proving that a complex series is absolutely convergent. We list some versions,
occasionally useful, of the ratio test and Cauchy’s root test with weaker conditions
than the usual ones. They are weaker in the sense that they do not require a limit, 
only an inequality. The proofs, all easy, are left to the reader. Some of them can be 
expressed using limit superior; it is left to the reader to see how. 


Ratio Test 


(a) Assume that $a_n \neq 0$ for all $n$. Instead of requiring that $\lim_{n\to\infty} a_{n+1}/a_n = t$ and
$t < 1$, it suffices for convergence to assume there exist $N$ and $s < 1$, such that
$a_{n+1}/a_n < s$ for all $n \ge N$.

(b) Assume that $a_n \neq 0$ for all $n$. Instead of requiring that $\lim_{n\to\infty} a_{n+1}/a_n = t$
and $t > 1$, it suffices for divergence to assume that there exists $N$, such that
$a_{n+1}/a_n \ge 1$ for all $n \ge N$.


Root Test


(c) Instead of requiring $\lim_{n\to\infty} a_n^{1/n} = t$ and $t < 1$, it suffices for convergence to
assume that there exist $N$ and $s < 1$, such that $a_n^{1/n} < s$ for all $n \ge N$.
(d) Instead of requiring $\lim_{n\to\infty} a_n^{1/n} = t > 1$, it suffices for divergence to assume
that $a_n^{1/n} \ge 1$ for infinitely many $n$.


10.2.4 Conditional Convergence: Leibniz’s Test 


A series that is convergent but not absolutely convergent is said to be conditionally
convergent. An example of such a series is


$$1 - \frac12 + \frac13 - \frac14 + \frac15 - \cdots = \sum_{n=1}^\infty \frac{(-1)^{n-1}}{n},$$


the sum of which is $\ln 2$, as we shall see. This is an example of an alternating series,
meaning that the terms are alternately positive and negative. 
The following test is called Leibniz’s test, or, the alternating series test. 


Proposition 10.7 Let $(a_n)_{n=1}^\infty$ be a sequence of positive numbers that are decreasing
and tend to 0. The following conclusions hold:


(1) The series $\sum_{n=1}^\infty (-1)^{n-1} a_n$ is convergent.

(2) Let $s_n = \sum_{k=1}^n (-1)^{k-1} a_k$ and $s = \sum_{k=1}^\infty (-1)^{k-1} a_k$. Then $s_{2n-1}$ tends to $s$ from
above and $s_{2n}$ tends to $s$ from below.

(3) We have the error estimate $|s - s_n| \le a_{n+1}$.


Proof The reader is invited to supply the proof, by induction, that


$$s_2 \le s_4 \le \cdots \le s_{2n} \le s_{2n-1} \le \cdots \le s_3 \le s_1 \qquad (n = 1, 2, 3, \ldots).$$

It follows that the sequence $s_{2n}$ is increasing and bounded above, whilst the sequence
$s_{2n-1}$ is decreasing and bounded below. We conclude that both sequences are convergent.
But $s_{2n-1} - s_{2n} = a_{2n} \to 0$, so that both have the same limit $s$, which is then
the limit $\lim_{n\to\infty} s_n$. This proves conclusion 1.
We next note that
$$s_{2n} \le s \le s_{2n+1} \le s_{2n-1},$$


which gives
$$s - s_{2n} \le s_{2n+1} - s_{2n} = a_{2n+1}$$


and
$$s_{2n-1} - s \le s_{2n-1} - s_{2n} = a_{2n}.$$


This proves conclusions 2 and 3. 
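
The conclusions of Leibniz’s test can be observed on the alternating harmonic series. The following Python sketch is an illustration only; it takes the value $\ln 2$ of the sum from the text, and the function name is a hypothetical choice.

```python
import math

def alternating_harmonic_partial_sums(count):
    """Partial sums s_1, ..., s_count of 1 - 1/2 + 1/3 - ..."""
    sums, total = [], 0.0
    for k in range(1, count + 1):
        total += (-1) ** (k - 1) / k
        sums.append(total)
    return sums

s = math.log(2)                                   # the sum, as stated in the text
partial = alternating_harmonic_partial_sums(1000)

# Conclusion 3: the error after n terms is at most a_{n+1} = 1/(n+1).
errors_within_bound = all(abs(s - partial[n - 1]) <= 1.0 / (n + 1)
                          for n in range(1, 1001))
# Conclusion 2: odd partial sums lie above s, even partial sums below.
odd_above = all(partial[n - 1] > s for n in range(1, 1001, 2))
even_below = all(partial[n - 1] < s for n in range(2, 1001, 2))
```

The error bound also shows how slowly this series converges: about a thousand terms are needed for three correct decimal places.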


Now we are going to study the series $\sum_{n=1}^\infty (-1)^{n-1}/n$ which, as we said before,
has the sum $\ln 2$. We are going to change the order of the terms.

We separate the terms of the series into two sequences: the positive terms form the
sequence $(1/(2n-1))_{n=1}^\infty$, and the negative terms the sequence $(-1/2n)_{n=1}^\infty$. From
these two sequences we shall build a new series that has exactly the same terms as
the original series, but presented in a different order.

We take the first positive term, then the first and second negative terms, then the
next positive term, then the next two negative terms and so on, always taking one
positive term followed by two negative ones. Proceeding rather recklessly in the spirit
of the mathematicians of the eighteenth century, we write this down and calculate:


$$
\begin{aligned}
&1 - \frac12 - \frac14 + \frac13 - \frac16 - \frac18 + \frac15 - \frac1{10} - \frac1{12} + \frac17 - \frac1{14} - \cdots\\
&\quad= \left(1 - \frac12\right) - \frac14 + \left(\frac13 - \frac16\right) - \frac18 + \left(\frac15 - \frac1{10}\right) - \frac1{12} + \left(\frac17 - \frac1{14}\right) - \cdots
\end{aligned}
\qquad (10.1)
$$


The striking conclusion that we wish to draw is that the sum of the series in line 1 of
(10.1) is different from that of the series $\sum_{n=1}^\infty (-1)^{n-1}/n$, although both series have
the same terms, but presented in a different order. However, some doubt may linger
over the validity of the first equals sign. We have not shown that the series in line 1
is convergent. Although the series in line 2 is convergent (being $\sum_{n=1}^\infty (-1)^{n-1}/2n$),
it is not the same series as in line 1.

A more rigorous argument might proceed as follows. Let $s_n$ be the sum of the first
$n$ terms of the series in line 1 of (10.1). Taking three terms at a time we have

$$s_{3n} = \sum_{k=1}^n \left(\frac{1}{2k-1} - \frac{1}{4k-2} - \frac{1}{4k}\right) = \frac12\sum_{k=1}^{2n} \frac{(-1)^{k-1}}{k}.$$


Hence $s_{3n} \to \ln 2/2$. Now we note that the $n$th term of the series in line 1 tends to
zero as $n$ tends to infinity. Hence we conclude, by an easy argument left to the reader,
that $s_n \to \ln 2/2$.

It may seem a contradiction; by changing the order of the terms we obtain half 
the original sum. But it simply means that the commutative rule that holds for finite 
sums does not hold for infinite ones; and, after all, an infinite sum is really a limit. 
For absolutely convergent series things are much tamer. 
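
The halving of the sum can be watched numerically. The sketch below generates the rearranged series, one positive term followed by two negative ones, and sums a large number of terms; the names `rearranged_terms` and `partial_sum` are illustrative, and the number of terms is an arbitrary choice.

```python
import math
from itertools import islice

def rearranged_terms():
    """Yield 1, -1/2, -1/4, 1/3, -1/6, -1/8, 1/5, -1/10, -1/12, ..."""
    k = 1
    while True:
        yield 1.0 / (2 * k - 1)     # k-th positive term of the original series
        yield -1.0 / (4 * k - 2)    # next negative term
        yield -1.0 / (4 * k)        # and the one after that
        k += 1

def partial_sum(term_count):
    """Sum of the first term_count terms of the rearranged series."""
    return sum(islice(rearranged_terms(), term_count))

s_3n = partial_sum(3 * 20000)       # close to ln(2)/2, not ln(2)
half_log_two = math.log(2) / 2
```

Comparing `s_3n` with `half_log_two` and with `math.log(2)` makes the effect of the rearrangement visible, though only the argument above proves it.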


10.2.5 Rearrangements of Absolutely Convergent Series 


A permutation of a set $A$ is a bijective mapping $\phi : A \to A$. It is often convenient
to picture a permutation as a table. For example


$$\begin{array}{c|ccccccc} n & 1 & 2 & 3 & 4 & 5 & 6 & 7\\ \hline \phi(n) & 2 & 7 & 1 & 3 & 5 & 4 & 6 \end{array}$$


describes a permutation of the set $\{1, 2, 3, 4, 5, 6, 7\}$. An infinite example might be


$$\begin{array}{c|cccccccccccc} n & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 & \cdots\\ \hline \phi(n) & 1 & 2 & 4 & 3 & 6 & 8 & 5 & 10 & 12 & 7 & 14 & \cdots \end{array}$$


This is a permutation of $\mathbb{N}_+$. Is it obvious how to go on?

In fact a permutation of $\mathbb{N}_+$ may be viewed as a sequence of positive integers in
which every positive integer appears exactly once. To describe one a table may be
impracticable. In some cases a formula for $\phi(n)$ may be available. However, it may
not always be convenient to specify $\phi(n)$ by a formula; a verbal description may be
deemed enough to define it. The main thing is to ensure that every number in the
upper row of the table occurs exactly once in the lower row.

A permutation $\phi$ of the infinite set $\mathbb{N}_+$ produces a so-called rearrangement of the
series $\sum_{n=1}^\infty a_n$. This is the new series $\sum_{n=1}^\infty a_{\phi(n)}$.


Proposition 10.8 Let $\sum_{n=1}^\infty a_n$ be an absolutely convergent series of real or complex
terms. Let $\phi$ be a permutation of $\mathbb{N}_+$. Then the rearranged series $\sum_{n=1}^\infty a_{\phi(n)}$ is
absolutely convergent and $\sum_{n=1}^\infty a_{\phi(n)} = \sum_{n=1}^\infty a_n$.

Proof Let $\varepsilon > 0$. Write $b_n = a_{\phi(n)}$. By Cauchy’s principle, there exists $N$, such that
$\sum_{k=m}^n |a_k| < \varepsilon$ for all $m$ and $n$ that satisfy $N \le m \le n$. More than this, because the
terms are positive, the sum of a finite number of terms $|a_k|$, all of which have place
numbers $k \ge N$, is less than $\varepsilon$. The place numbers here do not have to be consecutive.

There exists $N_1$, such that the numbers $\phi(1), \ldots, \phi(N_1)$ include all the numbers
$1, \ldots, N$. This is so, because, as we recall, every number in the first row of the table
representing the permutation appears exactly once in the second. If $N_1 < m \le n$ then
all the numbers $\phi(m), \phi(m+1), \ldots, \phi(n)$ are higher than or equal to $N$. But then
we have

$$\sum_{k=m}^n |b_k| = \sum_{k=m}^n |a_{\phi(k)}| < \varepsilon.$$


We conclude, by Cauchy’s principle, that the series $\sum_{k=1}^\infty |b_k|$ is convergent.

Let $t = \sum_{n=1}^\infty a_n$ and $s = \sum_{n=1}^\infty b_n$. We wish to show that $s = t$. Let $\varepsilon > 0$. Choose
$N$ and $N_1$ as we did above. If $n \ge N_1$ then the numbers $\phi(1), \ldots, \phi(n)$ include all
the numbers $1, \ldots, N$, and so


$$\left|\sum_{k=1}^n b_k - \sum_{k=1}^N a_k\right| = \left|\sum_{k=1}^n a_{\phi(k)} - \sum_{k=1}^N a_k\right| < \varepsilon,$$


as all the terms in the sum $\sum_{k=1}^N a_k$ get cancelled. Letting $n \to \infty$ we conclude that
$\left|s - \sum_{k=1}^N a_k\right| \le \varepsilon$. But now we find, applying the triangle inequality:


$$|s - t| = \left|s - \sum_{k=1}^\infty a_k\right| \le \left|s - \sum_{k=1}^N a_k\right| + \left|\sum_{k=N+1}^\infty a_k\right| \le 2\varepsilon.$$


Since this holds for all $\varepsilon > 0$ we must have $s = t$.


10.2.6 Exercises 


1. Test the following series for convergence. In each case determine whether the 
series is absolutely convergent, conditionally convergent or divergent. 


@ > — 


a 
(c) : a = ee 4) 

(d) s ce ESIC : ae +5) 
(ec) s ee ay 
where a, b, c, d and e are real numbers and none of c, d or e is a negative
integer (but e is not necessarily the base of the natural logarithm). 


(f) $\displaystyle\sum_{n=1}^\infty \frac{n + a}{(n + b)(n + c)(n + d)}$,
where $a$, $b$, $c$ and $d$ are non-real complex numbers.

(g) $\displaystyle\sum_{n=1}^\infty \frac{1}{n + i}$


n+a 
© LL arHate’ 


where a, b and c are non-real complex numbers. 


2. Test the following series for convergence. In each case determine whether the 
series is absolutely convergent, conditionally convergent or divergent. 


@) >i (-a,)", 
n=1 


where a, = 5 if n is even and a, = ; if n is odd. 
Note. The ratio of each term to its predecessor is unbounded; so the ratio test does not 


Cm ae 
n=1 
(d) ae 


3 
ll 
fa 


= 
fo) 
Snane, 
Me 
oN 
— 
| 
(eo) 
io) 
a 
Vr 
3 | 
So 
NS 


3 
ll 
far 


& 
bas: 
= 
+ gS 
& 


= 
a) 
8 il 


re a, b and c are positive. 


ee ener een 
n De. 23 n)- 


[<2 


SS) 
ll 
_ 


3. Compute the limit 


| 
li : 
oe k +i 

4. Let $(a_n)_{n=1}^\infty$ and $(b_n)_{n=1}^\infty$ be sequences of complex numbers.


(a) Suppose that the series $\sum_{n=1}^\infty |a_n|^2$ and $\sum_{n=1}^\infty |b_n|^2$ are convergent. Show that
the series $\sum_{n=1}^\infty a_n b_n$ is absolutely convergent.

(b) More generally, let $p > 1$ and $q > 1$ and suppose that $(1/p) + (1/q) = 1$.
Suppose that the series $\sum_{n=1}^\infty |a_n|^p$ and $\sum_{n=1}^\infty |b_n|^q$ are convergent. Show that
the series $\sum_{n=1}^\infty a_n b_n$ is absolutely convergent.


5. Let $(a_n)_{n=1}^\infty$ be a convergent sequence of complex numbers and let $\lim_{n\to\infty} a_n = w$.
Show that every rearrangement of the sequence has the same limit. More precisely,
if $\phi : \mathbb{N}_+ \to \mathbb{N}_+$ is a bijection then $\lim_{n\to\infty} a_{\phi(n)} = w$.

6. Let $(a_n)_{n=1}^\infty$ be a bounded complex sequence and let $\sum_{n=1}^\infty b_n$ be an absolutely
convergent complex series. Show that the series $\sum_{n=1}^\infty a_n b_n$ is absolutely convergent.


10.3 Product of Series 


Given two sequences $(a_n)_{n=0}^\infty$ and $(b_n)_{n=0}^\infty$ we can consider the set of all products of
the form $a_n b_m$. The reason for beginning at place number $n = 0$, instead of $n = 1$ as
before, is that this topic has important applications to power series, to be considered
in Chap. 11.

The products $a_n b_m$ do not form a sequence as they stand; they constitute
a family of elements indexed by the set of all pairs $(n, m)$ of natural numbers. It is
most natural to see this family as an infinite two-dimensional array, as pictured here:


$$\begin{array}{ccccc}
a_0b_0 & a_0b_1 & a_0b_2 & a_0b_3 & \cdots\\
a_1b_0 & a_1b_1 & a_1b_2 & a_1b_3 & \cdots\\
a_2b_0 & a_2b_1 & a_2b_2 & a_2b_3 & \cdots\\
a_3b_0 & a_3b_1 & a_3b_2 & a_3b_3 & \cdots
\end{array}$$


There are many ways to arrange the elements of the array as a sequence indexed as 
usual by the natural numbers. We can walk through the array taking in each element, a 
bit like walking through a large shopping mall and visiting every shop. The following 
diagram shows one way to do this: 
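
$$\begin{array}{cccccccc}
a_0b_0 & \rightarrow & a_0b_1 & & a_0b_2 & \rightarrow & a_0b_3 & \cdots\\
 & & \downarrow & & \uparrow & & \downarrow & \\
a_1b_0 & \leftarrow & a_1b_1 & & a_1b_2 & & a_1b_3 & \cdots\\
\downarrow & & & & \uparrow & & \downarrow & \\
a_2b_0 & \rightarrow & a_2b_1 & \rightarrow & a_2b_2 & & a_2b_3 & \cdots\\
 & & & & & & \downarrow & \\
a_3b_0 & \leftarrow & a_3b_1 & \leftarrow & a_3b_2 & \leftarrow & a_3b_3 & \cdots\\
\downarrow & & & & & & &
\end{array}$$


To interpret the diagram you must follow the arrows. The effect is to take in
square blocks of the array; a typical block consists of all terms $a_n b_m$ for which
$\max(n, m) \le N$, say, and is completed each time you visit the upper or left-hand
edge of the array.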

Another way through the array, which turns out to be very important, is shown in 
the next diagram: 

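$$\begin{array}{cccccccc}
a_0b_0 & \rightarrow & a_0b_1 & & a_0b_2 & \rightarrow & a_0b_3 & \cdots\\
 & \swarrow & & \nearrow & & \swarrow & & \nearrow\\
a_1b_0 & & a_1b_1 & & a_1b_2 & & a_1b_3 & \cdots\\
\downarrow & \nearrow & & \swarrow & & \nearrow & & \swarrow\\
a_2b_0 & & a_2b_1 & & a_2b_2 & & a_2b_3 & \cdots\\
 & \swarrow & & \nearrow & & \swarrow & & \nearrow\\
a_3b_0 & & a_3b_1 & & a_3b_2 & & a_3b_3 & \cdots\\
\downarrow & \nearrow & & \swarrow & & \nearrow & & \swarrow
\end{array}$$


The procedure indicated is to collect whole diagonals. A diagonal consists of the
$n + 1$ terms of the form $a_k b_{n-k}$ for some $n$. These are assembled in increasing order,
starting at $n = 0$, then $n = 1, 2, 3$, etc.

These are just the two most important ways to arrange the elements $a_n b_m$ in a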
simple sequence. There are clearly infinitely many ways to do it. In the following 
proposition we make a remarkable claim that applies to any possible arrangement. 


Proposition 10.9 Assume that the series $\sum_{n=0}^\infty a_n$ and $\sum_{n=0}^\infty b_n$ are absolutely
convergent. Set $A = \sum_{n=0}^\infty a_n$ and $B = \sum_{n=0}^\infty b_n$. Consider an arrangement of the
products $a_n b_m$, ($m \in \mathbb{N}$, $n \in \mathbb{N}$) in a sequence $(d_n)_{n=0}^\infty$. Then the series $\sum_{n=0}^\infty d_n$ is
absolutely convergent and its sum is $AB$.


Proof Given the natural number $n$ there exists a natural number $N$, such that all the
products $d_0, d_1, \ldots, d_n$ appear among the products that arise by multiplying out the
expression

$$(a_0 + a_1 + \cdots + a_N)(b_0 + b_1 + \cdots + b_N).$$


But then we have

$$\begin{aligned}
|d_0| + |d_1| + \cdots + |d_n| &\le (|a_0| + |a_1| + \cdots + |a_N|)(|b_0| + |b_1| + \cdots + |b_N|)\\
&\le \left(\sum_{n=0}^\infty |a_n|\right)\left(\sum_{n=0}^\infty |b_n|\right).
\end{aligned}$$


We conclude that the series $\sum_{n=0}^\infty d_n$ is absolutely convergent. Its sum is therefore
independent of the order in which the terms are arranged.
One possible arrangement is


$$\begin{aligned}
&a_0b_0 + a_0b_1 + a_1b_1 + a_1b_0 + a_0b_2 + a_1b_2 + a_2b_2 + a_2b_1 + a_2b_0\\
&\qquad + a_0b_3 + a_1b_3 + a_2b_3 + a_3b_3 + a_3b_2 + a_3b_1 + a_3b_0 + \cdots\\
&\quad= \overbrace{a_0b_0}^{e_0} + \overbrace{(a_0b_1 + a_1b_1 + a_1b_0)}^{e_1} + \overbrace{(a_0b_2 + a_1b_2 + a_2b_2 + a_2b_1 + a_2b_0)}^{e_2}\\
&\qquad + \overbrace{(a_0b_3 + a_1b_3 + a_2b_3 + a_3b_3 + a_3b_2 + a_3b_1 + a_3b_0)}^{e_3} + \cdots
\end{aligned}
\qquad (10.2)$$


Beneath “$e_n$” appear all products $a_j b_k$ for which $\max(j, k) = n$, that is, the products
not already counted in which one of the factors is $a_n$ or $b_n$. It corresponds
to the collection-by-blocks arrangement pictured in the first diagram above.
Now we have


$$e_0 + e_1 + \cdots + e_n = (a_0 + a_1 + \cdots + a_n)(b_0 + b_1 + \cdots + b_n),$$


which tends to $AB$ as $n \to \infty$. The sequence of partial sums of the series $\sum_{n=0}^\infty e_n$ is
a subsequence of the sequence of partial sums of the series in the first line of (10.2).
Since the latter series is known to be convergent we conclude that its sum is $AB$.
The sum of the arrangement $\sum_{n=0}^\infty d_n$ is therefore also $AB$.


10.3.1 Cauchy Product of Series 


By far the most important arrangement of the products $a_n b_m$ is by diagonals as shown
in the second diagram. This leads to the series:


$$\begin{aligned}
&a_0b_0 + a_0b_1 + a_1b_0 + a_0b_2 + a_1b_1 + a_2b_0 + a_0b_3 + a_1b_2 + a_2b_1 + a_3b_0 + \cdots\\
&\quad= \overbrace{a_0b_0}^{c_0} + \overbrace{a_0b_1 + a_1b_0}^{c_1} + \overbrace{a_0b_2 + a_1b_1 + a_2b_0}^{c_2}\\
&\qquad + \overbrace{a_0b_3 + a_1b_2 + a_2b_1 + a_3b_0}^{c_3} + \cdots
\end{aligned}$$


Here we have


$$c_n = \sum_{i+j=n} a_i b_j = \sum_{k=0}^n a_k b_{n-k}.$$

The series $\sum_{n=0}^\infty c_n$ is called the Cauchy product. By Proposition 10.9, if $\sum_{n=0}^\infty a_n$
and $\sum_{n=0}^\infty b_n$ are both absolutely convergent, we can conclude that


$$\sum_{n=0}^\infty c_n = \left(\sum_{n=0}^\infty a_n\right)\left(\sum_{n=0}^\infty b_n\right).$$
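
The Cauchy product is easy to compute for truncated series. The sketch below is an illustration under stated assumptions: it uses two geometric series $\sum x^n$ and $\sum y^n$ with $|x| < 1$ and $|y| < 1$, whose sums $1/(1-x)$ and $1/(1-y)$ are standard, and the particular values of `x`, `y` and the truncation length are arbitrary choices.

```python
def cauchy_product_terms(a, b):
    """c_n = sum_{k=0}^{n} a_k b_{n-k} for n below min(len(a), len(b))."""
    size = min(len(a), len(b))
    return [sum(a[k] * b[n - k] for k in range(n + 1)) for n in range(size)]

x, y = 0.5j, -0.3 + 0.2j            # both of modulus less than 1
terms = 80                          # truncation length (arbitrary)
a = [x ** n for n in range(terms)]  # a_n = x^n, absolutely convergent
b = [y ** n for n in range(terms)]  # b_n = y^n, absolutely convergent
c = cauchy_product_terms(a, b)

product_of_sums = (1 / (1 - x)) * (1 / (1 - y))   # A times B
sum_of_cauchy_product = sum(c)                    # partial sum of the c_n
```

With these values the partial sum of the Cauchy product should agree with the product of the sums up to rounding error, in line with Proposition 10.9.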


10.3.2 Exercises 


1. The series $\sum_{n=0}^\infty (-1)^n/\sqrt{n+1}$ converges by Leibniz’s test. Let its Cauchy product
with itself be the series $\sum_{n=0}^\infty c_n$. Show that $|c_n| \ge 1$, so that $\sum_{n=0}^\infty c_n$ diverges.

2. There are more ways to collect the pairs $(n, m)$ into a simple sequence than the
two covered in the text. For example, we can group together all pairs $(n, m)$ for
which $n$ and $m$ have a given product. This is like a Cauchy product, but uses the
product of place numbers instead of their sum.
Let the series $\sum_{n=1}^\infty a_n$ and $\sum_{n=1}^\infty b_n$ be absolutely convergent, and let their sums
be $A$ and $B$, respectively. For each $n$ we can take all pairs of place numbers
$(d, n/d)$ where $d$ ranges over all the divisors of $n$ (including 1 and $n$). This gives
the series $\sum_{n=1}^\infty c_n$ where


$$c_n = \sum_{d \mid n} a_d b_{n/d}.$$


Here, $d \mid n$ means that $d$ divides $n$, and instructs us to sum over all divisors of $n$.
By Proposition 10.9 the series $\sum_{n=1}^\infty c_n$ is convergent with sum $AB$.
The Riemann zeta function $\zeta(s)$ is defined for $s > 1$ by

$$\zeta(s) = \sum_{n=1}^\infty \frac{1}{n^s}.$$

Show that

$$\zeta(s)^2 = \sum_{n=1}^\infty \frac{\sigma(n)}{n^s},$$

where $\sigma(n)$ is the divisor function, that is, $\sigma(n)$ is the number of positive integral
divisors of $n$, including 1 and $n$.

3. Prove Mertens’ theorem. Let the series $\sum_{n=0}^\infty a_n$ and $\sum_{n=0}^\infty b_n$ be convergent, let
$\sum_{n=0}^\infty a_n = A$ and $\sum_{n=0}^\infty b_n = B$. Assume that one of the series, let us say the
first, is absolutely convergent. Then the Cauchy product is convergent and its
sum is $AB$.

Hint. Let

$$B_n = \sum_{k=0}^n b_k, \qquad C_n = \sum_{k=0}^n c_k.$$

Show that $C_n = \sum_{k=0}^n a_k B_{n-k}$ and use this to estimate $C_n - B\sum_{k=0}^n a_k$ for
large $n$.

4. In several places we have had occasions to sum a series by grouping its terms.
This appeared in the proof of Proposition 10.9 and in our study of rearranging
the series $\sum_{n=1}^{\infty} (-1)^{n-1}/n$ in Sect. 10.2. It was also used in the treatment of the
positive series $\sum_{n=1}^{\infty} n^{-p}$ (see Sect. 3.8).
In this exercise we shall define and study a general version of grouping. Begin
with the series $\sum_{n=1}^{\infty} a_n$. We do not assume that it is convergent. Let $(k_n)_{n=1}^{\infty}$ be
a strictly increasing sequence of positive integers, beginning with $k_1 = 1$. We
replace the series $\sum_{n=1}^{\infty} a_n$ by the grouped series $\sum_{n=1}^{\infty} b_n$, where

$$b_n = \sum_{j=k_n}^{k_{n+1}-1} a_j.$$

(a) Suppose that the series $\sum_{n=1}^{\infty} a_n$ is convergent. Show that the grouped series
$\sum_{n=1}^{\infty} b_n$ is convergent and has the same sum.

(b) Suppose that $a_n \to 0$ and that the sequence $k_{n+1} - k_n$ is bounded above (in
other words there is a cap on the number of terms that are grouped together).
Show that the series $\sum_{n=1}^{\infty} a_n$ is convergent if and only if the grouped series
$\sum_{n=1}^{\infty} b_n$ is convergent.

(c) Suppose that $a_n > 0$ for each $n$. Show that the series $\sum_{n=1}^{\infty} a_n$ is convergent
if and only if the grouped series $\sum_{n=1}^{\infty} b_n$ is convergent.

(d) Give an example where the series $\sum_{n=1}^{\infty} a_n$ is divergent but the grouped
series $\sum_{n=1}^{\infty} b_n$ is convergent.


10.4 () Riemann’s Rearrangement Theorem 


Riemann proved a striking theorem that throws light on our experiments on
rearranging the series $\sum_{n=1}^{\infty} (-1)^{n-1}/n$.


Proposition 10.10 Let the real number series $\sum_{n=1}^{\infty} a_n$ be conditionally convergent
and let $t$ be either a real number, or else $+\infty$ or $-\infty$. Then there is a rearrangement
$\sum_{n=1}^{\infty} a_{\sigma(n)}$ that has the sum $t$.


Proof It is important that the terms are real numbers. For each real number $x$, let
$x^+ = \max(x, 0)$ and $x^- = -\min(x, 0)$. Then $x = x^+ - x^-$ and $|x| = x^+ + x^-$. Now the
series $\sum |a_n|$ is divergent (because $\sum a_n$ is conditionally convergent) and
$|a_n| = a_n^+ + a_n^-$. Hence it is impossible for both the series, $\sum_{n=1}^{\infty} a_n^+$ and $\sum_{n=1}^{\infty} a_n^-$, to
be convergent. But if one is convergent so is the other, since $\sum_{n=1}^{\infty} (a_n^+ - a_n^-)$ is
convergent. We conclude therefore that both the series $\sum_{n=1}^{\infty} a_n^+$ and $\sum_{n=1}^{\infty} a_n^-$ are
divergent.

From this point we will assume that no term in the sequence $a_n$ is 0 (we can strike
out all the zeros and put them back later if we so wish). We form two increasing
sequences of integers, $k_n$ and $\ell_n$, which are such that the terms $a_{k_n}$ are all the positive
terms of the sequence $(a_n)_{n=1}^{\infty}$ in order of increasing index; and the terms $a_{\ell_n}$ are all
the negative terms of the sequence $(a_n)_{n=1}^{\infty}$, again in order of increasing index. Both
these sequences can be defined formally by induction if we so wish.

Now set $c_n = a_{k_n}$ and $d_n = a_{\ell_n}$. It is helpful to imagine these two sequences spread
out before us from left to right like cards from a pack:

$$c_1, c_2, c_3, c_4, c_5, c_6, \ldots \qquad d_1, d_2, d_3, d_4, d_5, d_6, \ldots$$


The first sequence comprises all the positive terms, and the second all negative
terms, of the sequence $(a_n)_{n=1}^{\infty}$, taken in order of increasing $n$. We know from the
first paragraph that $\sum_{n=1}^{\infty} c_n = \infty$ and $\sum_{n=1}^{\infty} d_n = -\infty$. We also know that
$\lim_{n\to\infty} c_n = \lim_{n\to\infty} d_n = 0$, as follows from the fact that $\sum_{n=1}^{\infty} a_n$ is convergent.

We now describe a rearrangement that sums to $t$. We take the case that $t$ is a finite
number, leaving the two cases of infinite $t$ to the reader. Starting with the left-hand
sequence $c_n$ (of positive terms) we take as many terms from left to right as are needed
to make a sum higher than $t$, stopping as soon as $t$ is surpassed. Note that we want
to go higher than $t$; if we land on $t$ we take an additional term. This can be carried
out because $\sum_{n=1}^{\infty} c_n = \infty$, so any number can be surpassed by a sum $\sum_{n=1}^{N} c_n$ if we
take $N$ large enough.

Let the last term taken from the left-hand sequence be $c_{p_1}$. It is clear that the
error, that is the amount by which $\sum_{n=1}^{p_1} c_n$ exceeds $t$, is at most $c_{p_1}$. We continue the
sequence $c_1, \ldots, c_{p_1}$ with as many terms from the right-hand sequence as are needed
to make, together with the positive terms already taken, a sum lower than $t$. Again
this is feasible because $\sum_{n=1}^{\infty} d_n = -\infty$. Again we take only as many as are needed
stopping as soon as the sum passes below $t$ (if we land on $t$ we take an additional
term).

Suppose the last term taken to be $d_{q_1}$. Then the sum so far,

$$c_1 + \cdots + c_{p_1} + d_1 + \cdots + d_{q_1}$$

differs from $t$ by at most $|d_{q_1}|$. We return to the left-hand sequence from where we
left off and take just enough positive terms to surpass $t$ again, say from $c_{p_1+1}$ to $c_{p_2}$,
then just enough negative terms from the right-hand sequence to pass below $t$ again,
say from $d_{q_1+1}$ to $d_{q_2}$, and so on. This process will never terminate, because all terms
are non-zero and the series $\sum c_n$ and $\sum d_n$ both diverge, the first to $+\infty$ and the
second to $-\infty$. So in this fashion we construct a rearrangement $\sum a_{\sigma(n)}$ of $\sum a_n$.

The rearrangement sums to $t$. To see this observe first that if all the terms of $a_n$
satisfy $|a_n| < M$, then the error, the absolute difference between a partial sum of our
rearrangement and $t$, is less than $M$ as soon as we reach the partial sum $\sum_{n=1}^{p_1} c_n$ as
described above. It remains less than $M$ from there on.

Now let $\varepsilon > 0$. From some point on, from an index $r$ say, all the terms $c_n$ and $d_n$
with $n \ge r$ have absolute value less than $\varepsilon$. Eventually in our rearrangement we will
have exhausted all terms to the left of $c_r$ and $d_r$. After this all unused terms have
absolute value less than $\varepsilon$; so by the argument of the last paragraph, the error will
drop below $\varepsilon$ after a finite number of terms are added on, and will remain below $\varepsilon$.
In other words our rearrangement sums to $t$.
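The construction in the proof is effectively an algorithm, and it can be tried out numerically. The sketch below is an illustration under stated assumptions rather than part of the proof: it uses the series $\sum (-1)^{n-1}/n$, so that the positive terms are $1, 1/3, 1/5, \ldots$ and the negative terms are $-1/2, -1/4, \ldots$, and the function name `rearranged_partial_sum` is hypothetical. One small deviation from the text: the code always takes at least one positive term at the start, even when the target is negative.

```python
from itertools import count

def rearranged_partial_sum(t, n_terms):
    """Partial sum of the greedy rearrangement of sum (-1)^(n-1)/n aimed at t."""
    pos = (1.0 / n for n in count(1, 2))    # c_1, c_2, ... = 1, 1/3, 1/5, ...
    neg = (-1.0 / n for n in count(2, 2))   # d_1, d_2, ... = -1/2, -1/4, ...
    total = 0.0
    going_up = True                         # start with the positive terms
    for _ in range(n_terms):
        if going_up:
            total += next(pos)
            if total > t:                   # t surpassed: switch to negative terms
                going_up = False
        else:
            total += next(neg)
            if total < t:                   # passed below t: switch back
                going_up = True
    return total

for target in (1.5, 0.0, -1.0):
    print(target, rearranged_partial_sum(target, 20_000))
```

The printed partial sums should lie close to the chosen targets, with an error no larger than the most recent term at which the sum crossed $t$.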


10.4.1 Exercises 


1. A proof like the one above that contains so much written text can leave a lingering
doubt about its correctness. Dispel some of these doubts by writing a nice
definition by induction of the sequences $k_n$, $\ell_n$, $p_n$, $q_n$ and the rearrangement mapping
$\sigma(n)$. You may wish to use Proposition 2.1.

2. Prove Riemann’s theorem for the cases $t = \infty$ and $t = -\infty$.

3. Show that a complex series $\sum_{n=1}^{\infty} z_n$ is absolutely convergent if and only if the
real series $\sum_{n=1}^{\infty} \operatorname{Re} z_n$ and $\sum_{n=1}^{\infty} \operatorname{Im} z_n$ are absolutely convergent. Hence show
that if a complex series $\sum_{n=1}^{\infty} z_n$ has the property that all its rearrangements are
convergent, then the series is absolutely convergent and all its rearrangements
have the same sum.


10.4.2 Pointers to Further Study 


— Series in Banach spaces 
— Orthogonal series 


10.5 () Gauss’s Test 


When the ratio test is used to test a positive series $\sum_{n=1}^{\infty} a_n$ for convergence, it
often happens that no conclusion is obtained because $\lim_{n\to\infty} a_{n+1}/a_n = 1$. A more
delicate test is required. It is a trade secret that it is best to try Gauss’s test.


Proposition 10.11 (Gauss’s test) Let $\sum_{n=1}^{\infty} a_n$ be a positive series, such that $a_n \ne 0$
for all $n$. Assume that

$$\frac{a_{n+1}}{a_n} = 1 - \frac{\mu}{n} + \frac{K_n}{n^{1+r}},$$

where $(K_n)_{n=1}^{\infty}$ is a bounded sequence, $\mu$ a constant, and $r$ a strictly positive constant.
Then the series $\sum_{n=1}^{\infty} a_n$ converges if $\mu > 1$ and diverges if $\mu \le 1$.

Here are some interesting points about Gauss’s test:


(a) There is no indeterminate case. Once $\mu$ is found a conclusion is reached.
(b) The equation is often written as

$$\frac{a_{n+1}}{a_n} = 1 - \frac{\mu}{n} + O\left(\frac{1}{n^{1+r}}\right).$$

The sequence $c_n$ is said to be $O(d_n)$ (and we write $c_n = O(d_n)$) if $c_n/d_n$ is
bounded as $n \to \infty$. See the nugget “Asymptotic orders of magnitude” for more
about this notation.

(c) If there exists a function $g(x)$, such that $g(0) = 1$, $g$ is twice differentiable at
$x = 0$, and $a_{n+1}/a_n = g(1/n)$, then $\mu = -g'(0)$.

(d) It is commonly the case that $a_{n+1}/a_n$ is a rational function of $n$ and then $\mu$ is
easily found by point (c). This explains the great utility of Gauss’s test and was
the historical context for introducing it.
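When the ratio is a rational function of $n$, the constant $\mu$ can also be read off numerically, since $n(1 - a_{n+1}/a_n)$ tends to $\mu$. The following sketch uses exact rational arithmetic; the example series $a_n = \binom{2n}{n}/4^n$ (with ratio $(2n+1)/(2n+2)$) and its square are illustrative assumptions, not examples from the text, and the function names are hypothetical.

```python
from fractions import Fraction

def mu_estimate(ratio, n):
    """Return n * (1 - a_{n+1}/a_n), which tends to mu when Gauss's expansion holds."""
    return n * (1 - ratio(n))

# Hypothetical example: a_n = binom(2n, n) / 4^n, so a_{n+1}/a_n = (2n+1)/(2n+2).
def ratio_central_binomial(n):
    return Fraction(2 * n + 1, 2 * n + 2)

# Squaring every term of the series squares the ratio.
def ratio_squared(n):
    return ratio_central_binomial(n) ** 2

for n in (10, 1000, 100000):
    print(n, float(mu_estimate(ratio_central_binomial, n)),
          float(mu_estimate(ratio_squared, n)))
```

The two columns should approach $1/2$ and $1$ respectively, in agreement with point (c) applied to $g(x) = (2+x)/(2+2x)$ and to $g(x)^2$. In both cases $\mu \le 1$, so Gauss’s test gives divergence.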


The proof of Gauss’s test is in several steps, in each of which another test is
introduced.
Step 1. Kummer’s tests.
Let $(D_n)_{n=1}^{\infty}$ be a positive sequence and set

$$\phi(n) = D_n - D_{n+1}\frac{a_{n+1}}{a_n}.$$

Then

(1) Suppose there exists $h > 0$ and $N$, such that $\phi(n) \ge h$ for all $n \ge N$. Then
$\sum_{n=1}^{\infty} a_n$ converges.

(2) Suppose $\sum_{n=1}^{\infty} D_n^{-1}$ is divergent and $\phi(n) \le 0$ for all $n \ge N$. Then $\sum_{n=1}^{\infty} a_n$
diverges.

Proof of Kummer’s Tests (1) We have that

$$D_k a_k - D_{k+1} a_{k+1} = a_k \phi(k) \ge h a_k, \quad (k \ge N).$$

Sum from $k = N$ to $k = n$:

$$D_N a_N - D_{n+1} a_{n+1} \ge h \sum_{k=N}^{n} a_k, \quad (n \ge N).$$

But then $\sum_{k=N}^{n} a_k < D_N a_N / h$ and it is bounded above as $n \to \infty$.

(2) We have that $D_k a_k \le D_{k+1} a_{k+1}$ for $k \ge N$, so that $D_k a_k \ge D_N a_N$ and
$a_k \ge D_N a_N D_k^{-1}$. But then $\sum_{k=1}^{\infty} a_k$ diverges since $\sum_{k=1}^{\infty} D_k^{-1}$ diverges.

Step 2. There are as many Kummer’s tests as there are ways to select the numbers $D_n$.
Let us look at some examples:

(a) $D_n = 1$ for all $n$.
Conclusions. If

$$1 - \frac{a_{n+1}}{a_n} \ge h > 0$$

for $n \ge N$, then the series converges. If

$$1 - \frac{a_{n+1}}{a_n} \le 0$$

for $n \ge N$, then the series diverges. This is D’Alembert’s test (ratio test) in a
slightly more general form.

(b) $D_n = n - 1$.
Conclusions. If

$$n\left(1 - \frac{a_{n+1}}{a_n}\right) \ge 1 + h > 1$$

for $n \ge N$, then the series converges. If

$$n\left(1 - \frac{a_{n+1}}{a_n}\right) \le 1$$

for $n \ge N$ then the series diverges. This is called Raabe’s test.


Step 3. Completion of the proof of Gauss’s test.
The condition that the series satisfies implies that

$$n\left(1 - \frac{a_{n+1}}{a_n}\right) \to \mu.$$

By Raabe’s test ((b) of step 2) the series converges if $\mu > 1$ and diverges if $\mu < 1$.
Finally we consider the case $\mu = 1$. So far Gauss’s test is just a special case
of Raabe’s test. It is because Gauss’s test resolves the case $\mu = 1$ that it merits its
special status. We apply Kummer’s test with $D_n = (n - 1)\ln(n - 1)$. We know that
$\sum D_n^{-1}$ diverges (or if not known it can be seen by Cauchy’s condensation test,
Sect. 3.8 Exercise 12, for example). We have

$$\phi(n) = D_n - D_{n+1}\frac{a_{n+1}}{a_n}$$

$$= (n - 1)\ln(n - 1) - n \ln n \left(1 - \frac{1}{n} + \frac{K_n}{n^{1+r}}\right)$$

$$= (n - 1)\ln\left(1 - \frac{1}{n}\right) - \frac{K_n \ln n}{n^r}.$$

Now $K_n$ is bounded and $\lim_{n\to\infty} \ln n / n^r = 0$, so that $\lim_{n\to\infty} K_n \ln n / n^r = 0$. We
also have

$$\lim_{n\to\infty} (n - 1)\ln\left(1 - \frac{1}{n}\right) = \lim_{t\to 0+} \left(\frac{1}{t} - 1\right)\ln(1 - t) = \lim_{t\to 0+} (1 - t)\frac{\ln(1 - t)}{t} = -1.$$

We find that $\lim_{n\to\infty} \phi(n) = -1$, so that $\phi(n) < 0$ when $n$ is sufficiently big. We
conclude that $\sum_{n=1}^{\infty} a_n$ diverges.

10.5.1 Exercises 


1. Verify point (c). Let $\sum a_n$ be a positive series. Suppose there exists a function
$g(x)$, such that $g(0) = 1$, $g$ is twice differentiable at $x = 0$ and $a_{n+1}/a_n = g(1/n)$.
Show that Gauss’s test can be applied with $\mu = -g'(0)$ and $r = 1$.

2. Use Gauss’s test to study the series $\sum_{n=1}^{\infty} n^{-p}$ where $p$ is a real number.

3. Use Gauss’s test to study the series

$$\sum_{n=0}^{\infty} \frac{(a)_n (b)_n}{(c)_n (d)_n}$$

where, for a real number $t$ and natural number $n$, we define

$$(t)_n = \begin{cases} 1 & \text{if } n = 0 \\ t(t+1)\cdots(t+n-1) & \text{if } n \ge 1. \end{cases}$$

We assume that neither $c$ nor $d$ is a non-positive integer in order to avoid zero
denominators. Careful! The series terminates for certain values of $a$ and $b$.


10.5.2 Pointers to Further Study 


— Convergence tests. 


Chapter 11
Function Sequences and Function Series


I shall apply all my strength to bring more light into the 
tremendous obscurity which one unquestionably finds in 
analysis. It lacks so completely all plan and system that it is 
peculiar that so many have studied it. The worst of it is, it has 
never been treated stringently. There are very few theorems in 
advanced analysis which have been demonstrated in a logically 
tenable manner. 


N. H. Abel 


11.1 Problems with Convergence 


Consider the following example of a function series:

$$\sum_{n=1}^{\infty} \left(\frac{x}{2}\right)^n, \quad 0 \le x \le 1.$$

With a function series, as here, it is important to specify carefully the domain of the
functions in the series. It should be the same for all the terms. In this case it is the
interval $[0, 1]$. If we fix a value for $x$ within the interval $[0, 1]$, we obtain a number
series, in fact, a geometric series which is convergent, having the sum

$$\sum_{n=1}^{\infty} \left(\frac{x}{2}\right)^n = \frac{x/2}{1 - x/2} = \frac{x}{2 - x}.$$

The function series is convergent for each $x$ in the interval $[0, 1]$ and its sum is the
function $x/(2 - x)$.

Consider next the function series:


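$$\sum_{n=0}^{\infty} \frac{x}{(1 + x)^n}, \quad 0 \le x \le 1.$$

For each $x$ in $[0, 1]$, such that $x > 0$, this is a geometric series with ratio $1/(1 + x)$,
and the first term is $x$. This geometric series is convergent, since we have
$1/(1 + x) < 1$, and its sum is

$$x \cdot \frac{1}{1 - \frac{1}{1+x}} = x \cdot \frac{1 + x}{x} = 1 + x.$$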


But if we try to put x = 0 in the above equalities the first two members make no 
sense. However, there is an obvious limit as x tends to 0, namely, 1. On the other 
hand it is also obvious that if we put x = 0 in the terms of the series we obtain a 
series in which every term is 0. The sum is therefore 0. So the sum of the function 
series is a discontinuous function on the interval [0, 1], in spite of the fact that each 
term is a continuous function on the same interval. 
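The jump at $x = 0$ can be seen numerically. The following is a small sketch under the assumption that the series in question is $\sum_{n=0}^{\infty} x/(1+x)^n$ (first term $x$, ratio $1/(1+x)$, as described above); the function name `partial_sum` and the sample points are hypothetical.

```python
def partial_sum(x, n_terms):
    """Sum of x / (1 + x)^n for n = 0, ..., n_terms - 1."""
    total, term = 0.0, x
    for _ in range(n_terms):
        total += term
        term /= (1.0 + x)
    return total

for x in (0.0, 0.001, 0.01, 0.1, 1.0):
    print(x, partial_sum(x, 100_000))
```

For $x > 0$ the printed sums should be close to $1 + x$, while at $x = 0$ every partial sum is exactly 0. Note also that the smaller $x$ is, the more terms are needed before the partial sum gets near $1 + x$.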

Of course the convergence of a series depends on the convergence of a sequence 
of partial sums. So the phenomenon exhibited here should first be studied in the case 
of function sequences. 

Consider then the function sequence

$$f_n(x) = x^n, \quad (0 \le x \le 1), \quad n = 1, 2, 3, \ldots$$

Computing the limit for each $x$ in the domain, we obtain the function $g : [0, 1] \to \mathbb{R}$,
where $g(x) = 0$ for $0 \le x < 1$ and $g(1) = 1$. Again we have a discontinuous limit
of a sequence of continuous functions.

In order to understand better what is happening here, let us fix $x$ in the domain,
such that $x < 1$. Now $\lim_{n\to\infty} x^n = 0$. Let $\varepsilon > 0$. We ask: how large must we choose
$N$ in order that $|x^n - g(x)| < \varepsilon$ for all $n \ge N$? Here $g(x) = 0$. The obvious answer
is: it suffices if $N$ exceeds $|\ln \varepsilon|/|\ln x|$.

It is interesting how the lowest $N$ found in the previous paragraph depends on $x$.
Let us ask: can we choose $N$ independent of $x$, such that $|x^n - g(x)| < \varepsilon$ for all
$n \ge N$ and $x$ in the domain? Can we use the same $N$ for all $x$ and achieve an error
less than $\varepsilon$? The answer is no. We saw that the lowest available $N$ for a given $x$ must
exceed $|\ln \varepsilon|/|\ln x|$. But now $\lim_{x\to 1-} |\ln \varepsilon|/|\ln x| = \infty$. Ever larger values of $N$
are required as $x$ approaches 1. On the other hand, when $x = 1$ it suffices to choose
$N = 1$.
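The growth of the lowest admissible $N$ as $x$ approaches 1 is easy to tabulate. The sketch below is illustrative only; the function name `smallest_N`, the tolerance and the sample points are assumptions made for the example.

```python
def smallest_N(x, eps):
    """Smallest N >= 1 with x**n < eps for all n >= N, assuming 0 <= x < 1."""
    n = 1
    while x**n >= eps:      # x**n decreases with n, so the first success suffices
        n += 1
    return n

eps = 0.01
for x in (0.5, 0.9, 0.99, 0.999):
    print(x, smallest_N(x, eps))
```

The printed values of $N$ should grow without bound as $x$ moves towards 1, whereas they stay bounded as long as $x$ is kept away from 1.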

The phenomenon just studied, and pictured in Fig. 11.1, is
particularly sensitive to the choice of domain. On the domain $[0, 1 - \delta]$ (we fix $\delta > 0$)
we can find $N$, that suffices, independent of $x$, namely, we can take $N$ as the smallest
integer that exceeds $|\ln \varepsilon| / |\ln(1 - \delta)|$ and achieve an error less than $\varepsilon$. For $n \ge N$
and for all $x$ in this restricted domain we have $|x^n - g(x)| < \varepsilon$.

Fig. 11.1 Non-uniform convergence

Fig. 11.2 Uniform convergence


11.2 Pointwise Convergence and Uniform Convergence 


Let $(f_n)_{n=1}^{\infty}$ be a sequence of functions defined in a common domain $A$.


Definition The function sequence $(f_n)_{n=1}^{\infty}$ is said to converge pointwise to a function
$g$ in the domain $A$ if, for each $x$ in $A$, we have $\lim_{n\to\infty} f_n(x) = g(x)$.


Definition The function sequence $(f_n)_{n=1}^{\infty}$ is said to converge uniformly to a function
$g$ in the domain $A$ (or sometimes “with respect to $A$”) if the following condition is
satisfied: for each $\varepsilon > 0$ there exists a natural number $N$, such that for all $x$ in $A$ and
for all $n \ge N$ we have $|f_n(x) - g(x)| < \varepsilon$.


Obviously if $f_n$ converges uniformly to $g$, it also converges pointwise to the same
limit. Uniform convergence is stronger in that we require that $N$ should be specifiable,
for each given $\varepsilon$, independently of $x$ in the domain. Uniform convergence is illustrated
in Fig. 11.2.

The difference is apparent in the corresponding statements in quantifier logic,
which eliminate all the ambiguity that may reside in everyday language. The order
of the quantifiers is the only difference. First, $f_n$ converges to $g$ pointwise:

$$(\forall x \in A)(\forall \varepsilon > 0)(\exists N \in \mathbb{N})(\forall n \in \mathbb{N})(n \ge N \Rightarrow |f_n(x) - g(x)| < \varepsilon).$$

Next, $f_n$ converges to $g$ uniformly:

$$(\forall \varepsilon > 0)(\exists N \in \mathbb{N})(\forall n \in \mathbb{N})(\forall x \in A)(n \ge N \Rightarrow |f_n(x) - g(x)| < \varepsilon).$$

The reader should practise reading aloud these two sentences in a literal translation
to ordinary speech.

Changing the domain may make a difference. The function sequence $(x^n)_{n=1}^{\infty}$
converges pointwise in the domain $[0, 1]$. The convergence is not uniform. The same
function sequence, but in the domain $[0, 1 - \delta]$, converges uniformly, as we saw in
the last section. If there is some ambiguity about the domain in question we may say,
“The sequence $(f_n)_{n=1}^{\infty}$ converges to $g$, uniformly with respect to the domain $B$”.
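Uniform convergence amounts to saying that the largest error over the whole domain tends to 0. The sketch below estimates that largest error for $f_n(x) = x^n$ on sample grids. It is only an illustration: a finite grid gives a lower bound for the supremum (on $[0, 1[$ the true supremum of $x^n$ is 1 for every $n$), and the grid sizes, `delta` and function name are assumptions.

```python
def sup_error(n, grid):
    """Largest value of |x**n - g(x)| over the grid, where g = 0 on [0, 1)."""
    return max(x**n for x in grid)

M = 10_000
delta = 0.1
grid_open = [k / M for k in range(M)]                          # points of [0, 1), 1 excluded
grid_restricted = [(1 - delta) * k / M for k in range(M + 1)]  # points of [0, 1 - delta]

for n in (1, 10, 100, 1000):
    print(n, sup_error(n, grid_open), sup_error(n, grid_restricted))
```

On the restricted grid the error should shrink like $(1 - \delta)^n$, while on the grid approaching 1 it stays close to 1 even for large $n$.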


11.2.1 Cauchy’s Principle for Uniform Convergence 


Just as Cauchy’s principle for real number sequences gives a necessary and sufficient 
condition for convergence without needing a candidate for the limit, for function 
sequences there is a Cauchy’s principle for uniform convergence, that does not require 
us to guess the limit in advance. 


Proposition 11.1 Let $(f_n)_{n=1}^{\infty}$ be a sequence of functions with common domain $A$.
The following condition, called Cauchy’s condition, is necessary and sufficient for
uniform convergence of the sequence $(f_n)_{n=1}^{\infty}$: for all $\varepsilon > 0$ there exists $N$, such that
for all $n \ge N$, $m \ge N$ and $x$ in $A$ we have $|f_m(x) - f_n(x)| < \varepsilon$.


Proof Suppose the function sequence is uniformly convergent and let the function
$g$ be its limit. Let $\varepsilon > 0$. There exists $N$, such that for all $n \ge N$ and all $x$ in $A$ we
have $|f_n(x) - g(x)| < \varepsilon/2$. Now for all $n \ge N$, $m \ge N$ and $x$ in $A$ we have

$$|f_n(x) - f_m(x)| \le |f_n(x) - g(x)| + |g(x) - f_m(x)| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$$

Conversely, suppose that the function sequence $(f_n)_{n=1}^{\infty}$ satisfies Cauchy’s
condition. Now for each fixed $x$ the numerical sequence $(f_n(x))_{n=1}^{\infty}$ satisfies Cauchy’s
condition for a real sequence, and hence is convergent as a sequence of real numbers.
Let its limit be the number $g(x)$. This defines a function $g$ with domain $A$. We shall
show that $f_n$ converges uniformly to $g$.

Let $\varepsilon > 0$. There exists $N$, such that for all $m \ge N$, $n \ge N$ and $x$ in $A$ we have
$|f_m(x) - f_n(x)| < \varepsilon$. We may let $m$ tend to infinity in this inequality and deduce
that $|g(x) - f_n(x)| \le \varepsilon$, and this therefore holds for all $n \ge N$ and all $x$ in $A$. This
shows that $f_n$ converges uniformly to $g$.


11.2.2 Uniform Convergence and Continuity


There follows a key reason why uniform convergence is so important. The proof is
the locus classicus for what is called the $\varepsilon/3$-argument.


Proposition 11.2 Let $(f_n)_{n=1}^{\infty}$ be a sequence of continuous functions on the domain $A$.
Assume that the sequence converges uniformly on the domain $A$ to a function $g$. Then
$g$ is continuous in the domain $A$.


Proof Let $c \in A$. We shall show that $g$ is continuous at $c$. Let $\varepsilon > 0$. There
exists $N$, such that $|f_n(x) - g(x)| < \varepsilon/3$ for all $n \ge N$ and all $x \in A$. The
function $f_N$ is continuous by assumption. Hence there exists $\delta > 0$, such that
$|f_N(x) - f_N(c)| < \varepsilon/3$ for all $x$ in $A$ that satisfy $|x - c| < \delta$. We now find, if $x$
is in $A$ and $|x - c| < \delta$, that

$$|g(x) - g(c)| \le |g(x) - f_N(x)| + |f_N(x) - f_N(c)| + |f_N(c) - g(c)| < \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon.$$

The proof actually shows that $g$ is continuous at $c$, given only that, firstly, each
function $f_n$ is continuous at $c$, and secondly, $f_n$ converges uniformly to $g$ in an open
interval containing $c$.


11.2.3 Uniform Convergence of Series


Let $\sum_{n=1}^{\infty} f_n$ be a function series, such that each term has the same domain $A$.


Definition The series $\sum_{n=1}^{\infty} f_n$ is pointwise convergent on the domain $A$ and its sum
is the function $g$, if, for each $x$ in $A$, we have $\lim_{n\to\infty} \sum_{k=1}^{n} f_k(x) = g(x)$.


Definition The series $\sum_{n=1}^{\infty} f_n$ is uniformly convergent on the domain $A$ and its sum
is the function $g$, if $\lim_{n\to\infty} \sum_{k=1}^{n} f_k(x) = g(x)$ uniformly with respect to $x$ in $A$.


We may be able to infer about a series $\sum_{n=1}^{\infty} f_n(x)$, that there exists a function
$g$, such that the series converges uniformly to $g$, and yet we may not know anything
about $g$. So to eliminate mentioning $g$ at all, we simply say “The series $\sum_{n=1}^{\infty} f_n(x)$
is uniformly convergent”.

The importance of uniform convergence of a function series is obvious: if each
term is a continuous function, so is the sum function. For this to be useful we need
a convenient test for uniform convergence of a series. Fortunately we have one.


Proposition 11.3 (Weierstrass M-test) Let $\sum_{n=1}^{\infty} f_n$ be a function series where the
terms have a common domain $A$. Assume that there exists a sequence of positive
numbers $(M_n)_{n=1}^{\infty}$, such that the series $\sum_{n=1}^{\infty} M_n$ is convergent, and such that for all
$x$ in $A$ and for all $n$ we have $|f_n(x)| \le M_n$. Then the series $\sum_{n=1}^{\infty} f_n$ is uniformly
convergent.


Proof The numerical series $\sum_{n=1}^{\infty} f_n(x)$ is convergent (indeed absolutely
convergent) for each $x$ in $A$ by the comparison test. The sum is then a function $g$ with
domain $A$.

So far, the convergence to $g$ is only pointwise. Let $\varepsilon > 0$. Since the series $\sum_{n=1}^{\infty} M_n$
is convergent, there exists $N$, such that $\sum_{k=N+1}^{\infty} M_k < \varepsilon$. It follows, for all $n \ge N$ and
all $x \in A$, that

$$\left|g(x) - \sum_{k=1}^{n} f_k(x)\right| = \left|\sum_{k=n+1}^{\infty} f_k(x)\right| \le \sum_{k=n+1}^{\infty} |f_k(x)| \le \sum_{k=n+1}^{\infty} M_k < \varepsilon.$$

The convergence is therefore uniform.
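The estimate in the proof can be checked on a concrete series. The sketch below assumes the illustrative series $\sum_{n=1}^{\infty} \cos(nx)/2^n$ on $A = \mathbb{R}$, with $M_n = 2^{-n}$, so that the tail $\sum_{k>n} M_k$ equals $2^{-n}$. Neither the series nor the names in the code come from the text.

```python
from math import cos

def partial_sum(x, n):
    """S_n(x) = sum_{k=1}^{n} cos(k x) / 2^k."""
    return sum(cos(k * x) / 2**k for k in range(1, n + 1))

n, m = 10, 60
xs = [0.01 * j for j in range(-700, 701)]      # sample points of A = R
worst = max(abs(partial_sum(x, m) - partial_sum(x, n)) for x in xs)

# The tail of the majorant series: sum_{k > n} 2^{-k} = 2^{-n}.
print(worst, 2.0**-n)
```

The largest observed difference between the two partial sums should not exceed $2^{-n}$, whatever the sample point $x$, which is exactly the uniformity the M-test provides.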


11.2.4 Cauchy’s Principle for Uniform Convergence of 
Function Series 


We can apply Cauchy’s principle for uniform convergence of function sequences to
the study of the function series $\sum_{n=1}^{\infty} f_n(x)$, with common domain $A$. We find that
the series is uniformly convergent if and only if it satisfies the condition: for all $\varepsilon > 0$
there exists $N$, such that for all $m$ and $n$ that satisfy $m > n \ge N$, and all $x$ in $A$, we
have

$$\left|\sum_{k=n+1}^{m} f_k(x)\right| < \varepsilon.$$

Weierstrass’s M-test supposes that the terms $f_n(x)$ are bounded in modulus by
constants $M_n > 0$, such that $\sum_{n=1}^{\infty} M_n < \infty$. Then we have, on choosing $N$ so that
$\sum_{k=N+1}^{\infty} M_k < \varepsilon$, that

$$\left|\sum_{k=n+1}^{m} f_k(x)\right| \le \sum_{k=n+1}^{m} |f_k(x)| \le \sum_{k=n+1}^{m} M_k < \varepsilon,$$

for all $x$ in $A$, provided only that $m > n \ge N$. This is another proof of Weierstrass’s
test, appealing to Cauchy’s principle, but the perspicacious reader will see that it is
really the same as the first one.

The Weierstrass M-test can only prove uniform convergence of a function series
if, for each $x$, it is an absolutely convergent numerical series. To prove that a function
series, that is conditionally convergent for certain values of $x$, is uniformly convergent
can be trickier, but Cauchy’s principle can be a valuable tool as we shall see.


11.2.5 Integration and Uniform Convergence


Let $(f_n)_{n=1}^{\infty}$ be a sequence of functions integrable on the interval $[a, b]$. If $f_n$ converges
pointwise to a function $g$ on $[a, b]$, we cannot in general infer that

$$\lim_{n\to\infty} \int_a^b f_n = \int_a^b \lim_{n\to\infty} f_n = \int_a^b g.$$

The first equality, involving the interchange of limit and integral, may be inadmissible.
In fact the function $g$ may even fail to be integrable. And even if it is integrable it is
not guaranteed that equality holds.
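A standard example of such a failure, not taken from the text, is $f_n(x) = 2nx\,e^{-nx^2}$ on $[0, 1]$: for each fixed $x$ the values tend to 0, yet $\int_0^1 f_n = 1 - e^{-n}$ tends to 1. The sketch below checks this with a simple midpoint rule; the function names and step count are assumptions made for the illustration.

```python
from math import exp

def f(n, x):
    """f_n(x) = 2 n x exp(-n x^2) on [0, 1]."""
    return 2.0 * n * x * exp(-n * x * x)

def midpoint_integral(func, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))

for n in (1, 10, 100, 1000):
    integral = midpoint_integral(lambda x: f(n, x), 0.0, 1.0)
    print(n, f(n, 0.5), integral)
```

The middle column (the value at the fixed point $x = 0.5$) should shrink rapidly, while the integrals should approach 1 rather than 0. The convergence $f_n \to 0$ is pointwise but not uniform, since the maximum of $f_n$ grows like $\sqrt{n}$.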

We ask: when is it permissible to interchange the operations of limit and integral? 
We have an extremely useful sufficient condition. 


Proposition 11.4 Let $(f_n)_{n=1}^{\infty}$ be a sequence of continuous functions on the interval
$[a, b]$ and assume that $\lim_{n\to\infty} f_n = g$ uniformly on $[a, b]$. Then $g$ is integrable and

$$\lim_{n\to\infty} \int_a^b f_n = \int_a^b g.$$


Proof In the first place $g$ is continuous and therefore integrable on $[a, b]$. Let $\varepsilon > 0$.
We choose $N$, such that $|f_n(x) - g(x)| < \varepsilon/(b - a)$ for all $n \ge N$ and all $x$ in $[a, b]$.
For such $n$ we find

$$\left|\int_a^b f_n - \int_a^b g\right| \le \int_a^b |f_n - g| \le \frac{\varepsilon}{b - a}(b - a) = \varepsilon.$$

That is, $\lim_{n\to\infty} \int_a^b f_n = \int_a^b g$.


And its counterpart for series, for which the proof should be obvious.


Proposition 11.5 Let $\sum_{n=1}^{\infty} f_n$ be a function series, where each term is a continuous
function on the domain $[a, b]$. Assume that the series $\sum_{n=1}^{\infty} f_n$ is uniformly convergent
on the domain $[a, b]$ and let its sum be the function $g$. Then

$$\int_a^b g = \sum_{n=1}^{\infty} \int_a^b f_n.$$


The proposition is sometimes loosely described in the following way: it is per- 
missible to integrate a function series term-by-term when the series in question is 
uniformly convergent. It is surprising how often one wants to integrate a function 
series term-by-term, so we are very glad to have this proposition. 

It also introduces a seminal theme. Often it happens, just when we wish to
integrate term-by-term, that uniform convergence is wanting. More flexible criteria for
allowing this do exist for the Riemann integral, but it is preferable to adopt a more
advanced integration theory, to wit, the Lebesgue integral, that allows for
term-by-term integration under much weaker conditions.


11.2.6 Differentiation and Uniform Convergence of Series 


After term-by-term integration we turn to term-by-term differentiation. First we look 
at the interchange of differentiation and limit for function sequences. This is a trifle 
more complicated than for integration. 


Proposition 11.6 Let $(f_n)_{n=1}^{\infty}$ be a sequence of differentiable functions on a common
open interval $A$ such that the derivatives $f_n'$ are all continuous. Suppose that there
exist functions $g$ and $h$ with domain $A$, such that for each $x$ in $A$ we have

$$\lim_{n\to\infty} f_n(x) = g(x), \quad \text{and} \quad \lim_{n\to\infty} f_n'(x) = h(x).$$

Assume that the second limit is uniform with respect to $A$. Then $g$ is differentiable in
$A$ and $g' = h$.


Proof The function $h$ is continuous by Proposition 11.2. Fix $a$ in $A$. By the previous
proposition and the fundamental theorem we find, for all $x \in A$, that

$$\int_a^x h = \lim_{n\to\infty} \int_a^x f_n' = \lim_{n\to\infty} \big(f_n(x) - f_n(a)\big) = g(x) - g(a).$$

But then $g' = h$ by the fundamental theorem.
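The hypothesis on the derivatives cannot simply be dropped. A standard example, not taken from the text, is $f_n(x) = \sin(nx)/\sqrt{n}$: the functions converge uniformly to $g = 0$, so $g' = 0$, yet $f_n'(0) = \sqrt{n}$ is unbounded. The sketch below, with hypothetical function names and sample points, displays both facts.

```python
from math import sin, cos, sqrt

def f(n, x):
    """f_n(x) = sin(n x) / sqrt(n); |f_n(x)| <= 1/sqrt(n) for every x."""
    return sin(n * x) / sqrt(n)

def f_prime(n, x):
    """f_n'(x) = sqrt(n) * cos(n x)."""
    return sqrt(n) * cos(n * x)

xs = [0.001 * j for j in range(-999, 1000)]    # sample points of A = ]-1, 1[
for n in (1, 100, 10_000):
    print(n, max(abs(f(n, x)) for x in xs), f_prime(n, 0.0))
```

The middle column (largest sampled value of $|f_n|$) should tend to 0, while the last column (the derivative at 0) grows like $\sqrt{n}$, so the derivatives do not converge to $g'(0) = 0$.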


Now for series we have the following:


Proposition 11.7 Let $\sum_{n=1}^{\infty} f_n$ be a function series on the open interval $A$, such that
each function $f_n$ is differentiable and the derivative $f_n'$ is continuous. Suppose that
the series $\sum_{n=1}^{\infty} f_n(x)$ and $\sum_{n=1}^{\infty} f_n'(x)$ are convergent for each $x \in A$ and set

$$g(x) = \sum_{n=1}^{\infty} f_n(x), \quad h(x) = \sum_{n=1}^{\infty} f_n'(x), \quad (x \in A).$$

Assume that the second series is uniformly convergent. Then $g$ is differentiable in $A$
and $g' = h$.


Proof Fix $a \in A$. For all $x \in A$ we have

$$g(x) - g(a) = \sum_{n=1}^{\infty} \big(f_n(x) - f_n(a)\big) = \sum_{n=1}^{\infty} \int_a^x f_n' = \int_a^x h$$

and the conclusion follows by the fundamental theorem.


11.2.7 Exercises


In problems where the uniform convergence of a function sequence $(f_n)_{n=1}^{\infty}$ is to
be assessed on an interval $A$, it may be straightforward to determine the pointwise
limit $g$. It can then be useful, in order to decide about uniform convergence, to
determine the maximum of $|f_n - g|$ in $A$. Sketching the graph of $f_n$ and $g$ can also
be helpful to orient one’s thinking.


1. Determine whether the following limits exist, and are attained uniformly or merely 
pointwise, on the stated intervals A: 


li , A=[0,1 
(a) im oem] [0, 1] 
(b) lim A= [0,1] 


n>oonx + 1’ 


(c) lim , Az=[6, 1], where 0 < 6 < 1. 
n>onx +1 


(d) lim 


> 
noo nx +1 


A = (0, 1]. 
2. Determine whether the following limits exist, and are attained uniformly or merely 
pointwise, on the stated intervals A: 
(a) $\lim_{n\to\infty} nx(1 - x)^n$, $A = [0, 1]$
(b) lim x” —x"), A= [0,1] 


n—->>oo 
2 
(c) lime™, A=R 
noo 1 
(d) lim -e”, A=R 
n>on 


(e) lim , A=[0, 1] 
n>onx +1 
(f) lim xe, A=[0, col 


n—>oo 
nx 


(g) lim xe", A=[6, o[, where 6 > 0. 


no 


1 
(hy) lim /x?+>5, A=R. 
n—>0o n 


3. The period of a pendulum of length $\ell$ swinging in a uniform gravitational field of
strength (acceleration) $g$ is given by

$$T(k) = 4\sqrt{\frac{\ell}{g}} \int_0^{\pi/2} \frac{1}{\sqrt{1 - k^2 \sin^2 x}}\, dx$$

where $k = \sin(\theta/2)$ and $\theta$ is the angular amplitude of the swing. Prove that

$$\lim_{k\to 0} T(k) = 2\pi\sqrt{\frac{\ell}{g}}.$$

Note. The limit is called the period of small oscillations and is the familiar approximate formula
for the period of swing given in books on elementary mechanics. For large values of $k$, the
approximation may be too inaccurate and then one would want to calculate the integral. We have
here another of Legendre’s standard elliptic integrals. For a surprising method to approximate
it see Exercise 5 below.

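4. Let $(a_n)_{n=1}^{\infty}$ and $(b_n)_{n=1}^{\infty}$ be positive sequences and suppose that

$$\lim_{n\to\infty} a_n = \lim_{n\to\infty} b_n = 1.$$

Show that

$$\lim_{n\to\infty} \frac{1}{\sqrt{a_n \sin^2 t + b_n \cos^2 t}} = 1$$

and that the limit is attained uniformly for $0 \le t \le 2\pi$.

Deduce that

$$\lim_{n\to\infty} \int_0^{2\pi} \frac{1}{\sqrt{a_n \sin^2 t + b_n \cos^2 t}}\, dt = 2\pi.$$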


5. The integral in the preceding exercise (actually another so-called elliptic integral)
can be computed numerically by using the arithmetic-geometric mean (Sect. 3.4
Exercise 10 and Sect. 5.12 Exercise 6). This discovery is due to Gauss, who
wrote down the change of variables in item (a) without giving the rather lengthy
calculations needed, merely writing that if they are done correctly this is the result.
Perseverance and a cool head are needed to do them correctly, but the payoff in
item (b) is worth the effort.

(a) Let $a$ and $b$ be distinct positive numbers and set

$$I(a, b) = \int_0^{\pi/2} \frac{1}{\sqrt{a^2 \cos^2\theta + b^2 \sin^2\theta}}\, d\theta.$$

Carry out a change of variables, from $\theta$ to $\phi$, where

$$\sin\theta = \frac{2a \sin\phi}{a + b + (a - b)\sin^2\phi}.$$

Show that the change of variables leads to the conclusion

$$I(a, b) = I(a_1, b_1)$$

where $a_1 = \frac{1}{2}(a + b)$ and $b_1 = \sqrt{ab}$.
Hint. It helps to take it in steps, by verifying the following formulas:

(i) $\cos\theta = \dfrac{2\cos\phi\,(a_1^2 \cos^2\phi + b_1^2 \sin^2\phi)^{1/2}}{a + b + (a - b)\sin^2\phi}$

(ii) $(a^2 \cos^2\theta + b^2 \sin^2\theta)^{1/2} = a\,\dfrac{a + b - (a - b)\sin^2\phi}{a + b + (a - b)\sin^2\phi}$

(iii) $\cos\theta\, d\theta = \dfrac{(2a\cos\phi)\,(a + b - (a - b)\sin^2\phi)}{(a + b + (a - b)\sin^2\phi)^2}\, d\phi$

(b) Applying the results of Exercise 4 together with Sect. 3.4 Exercise 10, show
that

$$I(a, b) = \frac{\pi}{2M(a, b)}$$

where $M(a, b)$ is the arithmetic geometric mean of $a$ and $b$.

Note. This gives an efficient way to calculate the integral. To see why consult Sect. 5.12
Exercise 6.

(c) Show that the period of swing of the pendulum in Exercise 3 is given by

$$T(k) = \frac{2\pi}{M(1, \sqrt{1 - k^2})}\sqrt{\frac{\ell}{g}} = \frac{2\pi}{M(1, \cos(\theta/2))}\sqrt{\frac{\ell}{g}}.$$


6. Using Weierstrass’s test on a function series $\sum_{n=1}^{\infty} f_n(x)$ is all about finding those
constants $M_n$. A useful approach can be to set $M_n$ equal to the maximum of $|f_n|$
in the interval $A$, if it exists and can be found.

Determine whether the following series converge uniformly or merely pointwise, 
on the stated intervals A: 


=i 
(a) ys sinnx, A=R 
n 


n=1 


ey 
(b) = sin (=~), A =]0, ool 


oe) 
f 
(c) S22" sin (ae): A = [6, cof, where 4 > 0. 


n=1 


oe) 
@ >ox"e™, A=R 
n=l 


x 
(e) y = as WP s, andA=R. 
' nP(1 +nx-) 
n= 
Note. If 0< p< : the question of uniformity is trickier to resolve. The simplest way to 
examine this is by comparing the sum for a given x to an improper integral, a topic studied in 


Chap. 12. 
[o.e) 


(f) 2 Shera where p > 0, and A = [6, oo[, where 5 > 0. 
7. Express the notions of pointwise convergence and uniform convergence of a
function sequence using set theory. More precisely let $(f_n)_{n=1}^{\infty}$ be a sequence
of functions whose common domain is a set $A$ of real numbers, and let $g$ be a
function with domain $A$. For each pair $(n, k)$ of positive integers we define a
subset of $A$ by the specification

$$B_{n,k} = \{x \in A : |f_j(x) - g(x)| < 1/k \text{ for all } j \ge n\}.$$

(a) Show that for each $k$, the sequence of sets $(B_{n,k})_{n=1}^{\infty}$ is increasing, and for
each $n$, the sequence of sets $(B_{n,k})_{k=1}^{\infty}$ is decreasing. In other words show that
$B_{n,k} \subset B_{n+1,k}$ and $B_{n,k+1} \subset B_{n,k}$.

(b) Show that $f_n \to g$ pointwise if and only if for each $k$ we have

$$\bigcup_{n=1}^{\infty} B_{n,k} = A.$$

(c) Show that $f_n \to g$ uniformly if and only if, for each $k$, there exists $n$, such
that

$$B_{n,k} = A.$$

Note. Here, at last, we get to use a union of infinitely many sets. Precisely, the union of a
sequence of sets $\bigcup_{n=1}^{\infty} C_n$ is the set of all $x$, such that there exists $n$, such that $x \in C_n$. The
formulation of convergence of a function sequence described in this problem is the key to some
important propositions; we can mention Dini’s theorem and Egorov’s theorem.


8. The assumptions of Proposition 11.6 can be weakened. Assume as in the
proposition that $f_n'(x) \to h(x)$ uniformly with respect to $A$, but regarding the
convergence of $f_n(x)$ assume only that there exists a point $x_0$ in $A$ such that the numerical
sequence $(f_n(x_0))_{n=1}^{\infty}$ converges. Prove that there exists a function $g$, such that
$f_n(x) \to g(x)$ pointwise in $A$ and $g'(x) = h(x)$ for all $x$ in $A$.

Hint. Show that for each $x$ the sequence $(f_n(x) - f_n(x_0))_{n=1}^{\infty}$ satisfies Cauchy’s
condition.


11.3. Power Series 


A power series is a function series of the form $\sum_{n=0}^{\infty} a_n x^n$. The numbers $a_n$ are
constants, called the coefficients of the power series. The terms $a_n x^n$ are meaningful
if $a_n$ and $x$ are complex numbers. So in our initial study of power series we shall
assume that this is the case, and write it as $\sum_{n=0}^{\infty} a_n z^n$, where $z$ is a complex variable
and the numbers $a_n$ are complex coefficients.

The initial term $a_0 z^0$ is always interpreted as the constant function $a_0$ (this obviates
the need to interpret $0^0$). The translated series $\sum_{n=0}^{\infty} a_n (z - c)^n$ is called a power
series with midpoint $c$.


11.3.1 Radius of Convergence


Given a power series the question of interest is: for which complex numbers z is it 
convergent? 


Proposition 11.8 If the power series $\sum_{n=0}^{\infty} a_n z^n$ is convergent for a given value
$z = z_0 \neq 0$, then it is absolutely convergent for all complex $z$ that satisfy $|z| < |z_0|$.


Proof Since the terms of a convergent series are bounded, there exists $K > 0$, such
that $|a_n z_0^n| < K$ for all $n$. Let $|z| < |z_0|$ and set $r = |z|/|z_0|$. Then $r < 1$ and

\[ |a_n z^n| = |a_n z_0^n| \left|\frac{z}{z_0}\right|^n < K r^n. \]

The geometric series $\sum_{n=0}^{\infty} K r^n$ is convergent, and by the comparison test the series
$\sum_{n=0}^{\infty} a_n z^n$ is also convergent, in fact absolutely convergent.


Proposition 11.9 For a given power series $\sum_{n=0}^{\infty} a_n z^n$, exactly one of the following
is true:

(1) $\sum_{n=0}^{\infty} a_n z^n$ is convergent only if $z = 0$.
(2) $\sum_{n=0}^{\infty} a_n z^n$ is absolutely convergent for all $z$.
(3) There exists a real number $R > 0$, such that $\sum_{n=0}^{\infty} a_n z^n$ is absolutely convergent
for all $z$ that satisfy $|z| < R$, and divergent for all $z$ that satisfy $|z| > R$.


Proof If $\sum_{n=0}^{\infty} a_n z^n$ is convergent for all $z$ then it is absolutely convergent for all $z$
by Proposition 11.8. If it is convergent for some $z = z_0 \neq 0$ (which excludes case
1), and divergent for some $z = z_1$ (which excludes case 2), then we set

\[ R = \sup \Big\{ |z| : \sum_{n=0}^{\infty} a_n z^n \text{ is convergent} \Big\}. \]

If $|z| < R$ then there exists $z_0$, such that $|z| < |z_0| \le R$ and $\sum_{n=0}^{\infty} a_n z_0^n$ is
convergent. But then $\sum_{n=0}^{\infty} a_n z^n$ is absolutely convergent. If $|z| > R$ then $\sum_{n=0}^{\infty} a_n z^n$
is divergent.


The number $R$ is called the radius of convergence of the power series. We extend
the notion of radius of convergence to cases 1 and 2 as follows. If $\sum_{n=0}^{\infty} a_n z^n$ is convergent
only for $z = 0$ then the radius of convergence is 0. If $\sum_{n=0}^{\infty} a_n z^n$ is convergent
for all $z$ then the radius of convergence is infinity, or symbolically $R = \infty$.

If $z$ is on the circle of convergence $|z| = R$, there is no general conclusion about
convergence. The series could be convergent for all $z$ on the circle, divergent for all
$z$ on the circle, or convergent at some points and divergent at others.

An example of the last kind is the series $\sum_{n=1}^{\infty} ((-1)^{n-1}/n) z^n$. This has radius
of convergence 1. It is easy to see that it is convergent for $z = 1$ but divergent for
$z = -1$. Actually, as we shall see later, it is convergent for all $z$ on the unit circle,
except for $z = -1$, a conclusion that requires a fairly delicate test (Dirichlet’s test).


11.3.2 Determining the Radius of Convergence by the Ratio Test


In most practical cases, the radius of convergence can be found using the ratio test.
We shall look at three examples that adequately convey the method. In addition the
conclusions stated will prove useful.


(a) The series $\sum_{n=0}^{\infty} n z^n$.

Here we have

\[ \lim_{n\to\infty} \frac{|(n+1) z^{n+1}|}{|n z^n|} = \lim_{n\to\infty} \frac{n+1}{n}\,|z| = |z|. \]

We conclude that the series is convergent for $|z| < 1$ and divergent for $|z| > 1$. The
radius of convergence is therefore 1.


(b) The series $\sum_{n=0}^{\infty} \frac{z^n}{n!}$.

We have

\[ \lim_{n\to\infty} \frac{|z^{n+1}/(n+1)!|}{|z^n/n!|} = \lim_{n\to\infty} \frac{|z|}{n+1} = 0. \]

The series is convergent for every $z$. The radius of convergence is therefore infinity.


(c) The series $\sum_{n=0}^{\infty} n!\, z^n$.

In this case

\[ \lim_{n\to\infty} \frac{|(n+1)!\, z^{n+1}|}{|n!\, z^n|} = \lim_{n\to\infty} (n+1)|z| = \infty \]

provided $z \neq 0$. The series is convergent only when $z = 0$. The radius of convergence
is 0.
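
The three examples can also be explored numerically. The following is only a sketch under stated assumptions: the helper name `ratio_estimate` and the dictionary of examples are illustrative and not part of the text. It relies on the fact that when the quotient $|a_n/a_{n+1}|$ of successive coefficients has a limit (finite or infinite), that limit is the radius of convergence, which is what the ratio test gives in examples (a) to (c).

```python
from fractions import Fraction
from math import factorial

def ratio_estimate(coeff, n):
    """Return |a_n / a_(n+1)| exactly; hypothetical helper, not from the text."""
    return abs(Fraction(coeff(n)) / Fraction(coeff(n + 1)))

# Coefficients a_n of the three series treated above.
examples = {
    "sum n z^n": lambda n: Fraction(n),
    "sum z^n / n!": lambda n: Fraction(1, factorial(n)),
    "sum n! z^n": lambda n: Fraction(factorial(n)),
}

for name, coeff in examples.items():
    estimates = [float(ratio_estimate(coeff, n)) for n in (10, 100, 1000)]
    print(name, estimates)
```

The printed quotients approach 1 in the first case, grow without bound in the second and shrink towards 0 in the third, in line with the radii 1, infinity and 0 found above. A finite table of quotients is of course only suggestive; the limits computed in the text are what prove the statements.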
11.3.3 Uniform Convergence of Power Series


Now we restrict our study to real power series $\sum_{n=0}^{\infty} a_n x^n$, with real coefficients $a_n$
and real variable $x$. We can still speak of the radius of convergence $R$. It is the radius
of convergence of the complex series $\sum_{n=0}^{\infty} a_n z^n$.

Most of the material of this and subsequent sections carries over to the case of a
complex variable, but requires considerations of continuity and differentiability with
respect to a complex variable that would carry us beyond the planned confines of
this text.


Proposition 11.10 Let $R$ be the radius of convergence of the power series $\sum_{n=0}^{\infty} a_n x^n$.
Let $[c_1, c_2]$ be a bounded and closed interval, such that $-R < c_1 < c_2 < R$. Then
the function series $\sum_{n=0}^{\infty} a_n x^n$ is uniformly convergent on the domain $[c_1, c_2]$.


Proof If $R$ is finite we choose $K$ so that $-R < -K < c_1 < c_2 < K < R$. If $R = \infty$
we choose $K$ so that $-K < c_1 < c_2 < K$. For $x \in [c_1, c_2]$ we have $|x| < K$, and
therefore

\[ |a_n x^n| \le |a_n| K^n. \]

Now the series $\sum_{n=0}^{\infty} |a_n| K^n$ is convergent since $K < R$. We conclude by the Weierstrass
M-test that the series $\sum_{n=0}^{\infty} a_n x^n$ is uniformly convergent on the domain
$[c_1, c_2]$.


Proposition 11.11 Let $\sum_{n=0}^{\infty} a_n x^n$ be a power series with real coefficients and real
variable $x$. Let the radius of convergence satisfy $R > 0$ (includes $R = \infty$). Let

\[ f(x) = \sum_{n=0}^{\infty} a_n x^n, \qquad -R < x < R. \]

Then the function $f$ is continuous in the interval $]-R, R[$.


Proof Let $-R < c_1 < c_2 < R$. The series is uniformly convergent on the domain
$[c_1, c_2]$. We conclude that $f$ is continuous in the interval $[c_1, c_2]$. But then $f$ must
be continuous in $]-R, R[$, since every $x$ in $]-R, R[$ lies in some interval of the form
$[c_1, c_2]$.


11.3.4 The Exponential Series 


The series $\sum_{n=0}^{\infty} z^n/n!$ is called the exponential series. We saw that it has infinite
radius of convergence. Let $f(z) = \sum_{n=0}^{\infty} z^n/n!$ for each $z \in \mathbb{C}$. This defines a function
of the complex variable $z$ that has remarkable properties.


Proposition 11.12 The function $f$ of the complex variable $z$ defined by

\[ f(z) = \sum_{n=0}^{\infty} \frac{z^n}{n!}, \qquad (z \in \mathbb{C}) \]

satisfies the functional equation

\[ f(z + w) = f(z) f(w) \]

for all complex $z$ and $w$.


Proof Let $z$ and $w$ be complex numbers. We calculate

\[
\begin{aligned}
f(z) f(w) &= \sum_{n=0}^{\infty} \frac{z^n}{n!} \, \sum_{n=0}^{\infty} \frac{w^n}{n!} \\
&= \sum_{n=0}^{\infty} \sum_{k=0}^{n} \frac{z^k}{k!} \, \frac{w^{n-k}}{(n-k)!} \qquad \text{(Cauchy product, Sect. 10.3)} \\
&= \sum_{n=0}^{\infty} \frac{1}{n!} \sum_{k=0}^{n} \frac{n!}{k!\,(n-k)!} \, z^k w^{n-k} \\
&= \sum_{n=0}^{\infty} \frac{1}{n!} \sum_{k=0}^{n} \binom{n}{k} z^k w^{n-k} \\
&= \sum_{n=0}^{\infty} \frac{1}{n!} (z + w)^n \qquad \text{(by the binomial rule)} \\
&= f(z + w).
\end{aligned}
\]

The foregoing proof is a beautiful example of how power series can be used to
obtain algebraic properties of functions. Note how Proposition 10.9 provides the
justification for the second equality sign.


Proposition 11.13 For all real $x$ we have

\[ e^x = \sum_{n=0}^{\infty} \frac{x^n}{n!}. \]


Proof Let $f(x) = \sum_{n=0}^{\infty} x^n/n!$ for all real $x$. We have $f(x + h) = f(x) f(h)$ for all
$x$ and $h$. Therefore

\[ \frac{f(x+h) - f(x)}{h} = f(x) \, \frac{f(h) - 1}{h}. \]

For $h \neq 0$ we have

\[ \frac{f(h) - 1}{h} = \sum_{n=1}^{\infty} \frac{h^{n-1}}{n!}, \]

which is a continuous function of $h$, and hence has the limit 1 at $h = 0$. We conclude,
by taking the limit of the difference quotient as $h \to 0$, that $f$ is differentiable at $x$
and the derivative satisfies $f'(x) = f(x)$, that is, $f$ is a solution of the differential
equation $y' = y$. Hence, by Proposition 7.13, we must have $f(x) = Ce^x$ for some
constant $C$, and it is easily seen, for example by putting $x = 0$, that $C = 1$.


The famous formula of this proposition motivates the extension of the exponential
function to complex variables. For each complex $z$ we define

\[ \exp z := \sum_{n=0}^{\infty} \frac{z^n}{n!}. \]

Then $\exp z$ extends to $\mathbb{C}$ the exponential function $e^x$ of the real variable $x$.


11.3.5 The Number e 


Now we have

\[ e = \exp(1) = \sum_{n=0}^{\infty} \frac{1}{n!} = 1 + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \frac{1}{4!} + \cdots \]

This provides a practical method (already studied in Sect. 7.2 Exercise 6) to calculate
$e$ since the series converges fast. If we take terms up to and including $1/N!$ then the
error is

\[
\frac{1}{(N+1)!} + \frac{1}{(N+2)!} + \frac{1}{(N+3)!} + \cdots
< \frac{1}{(N+1)!} \left( 1 + \frac{1}{N+2} + \frac{1}{(N+2)^2} + \cdots \right)
= \frac{N+2}{(N+1)\,(N+1)!}.
\]

For $N = 10$ this upper bound for the error is 0.0000000273..., whilst the sum to
$N = 10$ is 2.718281801... giving seven correct decimal digits of $e$.
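
The figures quoted for $N = 10$ can be reproduced with exact rational arithmetic. This is a minimal sketch; the function names `partial_sum` and `error_bound` are illustrative choices and do not come from the text.

```python
from fractions import Fraction
from math import factorial, e

def partial_sum(N):
    """Exact value of 1/0! + 1/1! + ... + 1/N! as a rational number."""
    return sum(Fraction(1, factorial(n)) for n in range(N + 1))

def error_bound(N):
    """The bound (N + 2) / ((N + 1) * (N + 1)!) derived above."""
    return Fraction(N + 2, (N + 1) * factorial(N + 1))

N = 10
s = partial_sum(N)
print("partial sum :", float(s))
print("error bound :", float(error_bound(N)))
print("actual error:", e - float(s))  # uses the library value of e for comparison
```

The actual error comes out slightly below the bound, as the estimate by a geometric series predicts.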

The calculation of $\pi$ is more difficult. In Sect. 8.1 Exercise 1 the value 1979/630
was obtained, which is short of $\pi$ by around 3/10000. Archimedes gave the bounds

\[ 3\tfrac{10}{71} < \pi < 3\tfrac{1}{7}. \]

The difference between these bounds is around 1/500. In fact the lower bound is
accurate to around 7/10000. Soon we shall exhibit a series that can be used to calculate
$\pi$ with arbitrary accuracy.
11.3.6 Differentiating a Power Series


The series $\sum_{n=0}^{\infty} (n+1) a_{n+1} z^n$, or equivalently $\sum_{n=1}^{\infty} n a_n z^{n-1}$, is obtained formally
by differentiating the series $\sum_{n=0}^{\infty} a_n z^n$ term-by-term with respect to $z$. We shall
call it the derived series of the series $\sum_{n=0}^{\infty} a_n z^n$. Whatever may be the relationship
between the sums of the two series, we can study the derived series in its own right.


Proposition 11.14 The derived series $\sum_{n=0}^{\infty} (n+1) a_{n+1} z^n$ has the same radius of
convergence as the original series $\sum_{n=0}^{\infty} a_n z^n$.


Proof Let $R$ be the radius of convergence of the original series $\sum_{n=0}^{\infty} a_n z^n$, and $R'$
the radius of convergence of the derived series $\sum_{n=0}^{\infty} (n+1) a_{n+1} z^n$.

Suppose first that $0 < |z| < R$ and choose $z_0$, so that $|z| < |z_0| < R$. Then the
series $\sum_{n=0}^{\infty} a_n z_0^n$ is convergent, so that there exists $K$, such that $|a_n z_0^n| < K$ for all
$n$. Now we find

\[
|(n+1) a_{n+1} z^n| = \frac{n+1}{|z|} \, |a_{n+1} z_0^{n+1}| \, \left|\frac{z}{z_0}\right|^{n+1}
< \frac{K}{|z|} \, (n+1) \left|\frac{z}{z_0}\right|^{n+1}.
\]

Since $|z/z_0| < 1$, we can refer to example (a) under “Determining the radius of
convergence by the ratio test” and conclude that the series

\[ \sum_{n=0}^{\infty} (n+1) \left|\frac{z}{z_0}\right|^{n+1} \]

is convergent. Hence, by the comparison test, the series

\[ \sum_{n=0}^{\infty} (n+1) a_{n+1} z^n \]

is convergent if $|z| < R$. But this means that $R' \ge R$. Note that if $R = 0$ the above
arguments are invalid but then it is trivial that $R' \ge R$.

Next we suppose that $|z| < R'$. The series $\sum_{n=0}^{\infty} |(n+1) a_{n+1} z^n|$ is now convergent
and for $n \ge 1$ we have

\[ |a_n z^n| \le |z| \, |n a_n z^{n-1}| \]

so that the series $\sum_{n=0}^{\infty} a_n z^n$ is also convergent. This means that $R \ge R'$. Again if
$R' = 0$ the deduction is invalid but the inequality is then trivial.

Putting the inequalities together gives $R = R'$.


Proposition 11.15 Let the power series $\sum_{n=0}^{\infty} a_n x^n$ have real coefficients and let $R$
be its radius of convergence. For all $x$ in the interval $]-R, R[$ we define the function
$f(x) = \sum_{n=0}^{\infty} a_n x^n$. We have the following conclusions:

(1) The derivatives $f^{(k)}$ of all orders exist for every $x$ in $]-R, R[$.
(2) The formula

\[
f^{(k)}(x) = \sum_{n=k}^{\infty} n(n-1)\cdots(n-k+1) \, a_n x^{n-k} = \sum_{n=k}^{\infty} \frac{n!}{(n-k)!} \, a_n x^{n-k}
\]

holds for every $x$ in $]-R, R[$. The equality is formally obtained by differentiating
the series $\sum_{n=0}^{\infty} a_n x^n$ term-by-term, $k$ times.
(3) For each $n$ we have $a_n = f^{(n)}(0)/n!$.


Proof However often we differentiate formally term-by-term we obtain a power
series with the same radius of convergence $R$. It therefore suffices to show that $f'$
exists and equals $\sum_{n=1}^{\infty} n a_n x^{n-1}$ for $|x| < R$. The rest is just repetition.

Let $0 < r < R$. The series $\sum_{n=1}^{\infty} n a_n x^{n-1}$ is uniformly convergent on the domain
$[-r, r]$. We conclude by Proposition 11.7 that $f$ is differentiable for $-r < x < r$
and $f'(x) = \sum_{n=1}^{\infty} n a_n x^{n-1}$. This holds for all $r$ that satisfy $0 < r < R$. This means
that the equation $f'(x) = \sum_{n=1}^{\infty} n a_n x^{n-1}$ must hold for $-R < x < R$.

By repeating this argument we find

\[
f^{(k)}(x) = \sum_{n=k}^{\infty} n(n-1)\cdots(n-k+1) \, a_n x^{n-k} = \sum_{n=k}^{\infty} \frac{n!}{(n-k)!} \, a_n x^{n-k}
\]

for all $x$ in $]-R, R[$. Finally, setting $x = 0$ gives $f^{(k)}(0) = k! \, a_k$.
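
Term-by-term differentiation is easy to try out on truncated coefficient lists. The sketch below is illustrative only: the helpers `derive` and `evaluate` are names invented here, and a finite list of coefficients stands in for the full series, so the comparison is approximate. It uses the geometric series, whose sum is $1/(1-x)$ for $|x| < 1$, and the exponential series.

```python
import math

def derive(coeffs):
    """Coefficients (n + 1) * a_(n+1) of the derived series (hypothetical helper)."""
    return [(n + 1) * coeffs[n + 1] for n in range(len(coeffs) - 1)]

def evaluate(coeffs, x):
    """Evaluate the polynomial a_0 + a_1 x + ... by Horner's rule."""
    total = 0.0
    for a in reversed(coeffs):
        total = total * x + a
    return total

# Geometric series: all coefficients are 1, sum 1/(1 - x), derivative 1/(1 - x)^2.
geometric = [1.0] * 200
x = 0.5
print(evaluate(derive(geometric), x), 1 / (1 - x) ** 2)

# Exponential series: the derived series has the same coefficients 1/n!.
exponential = [1 / math.factorial(n) for n in range(30)]
print(evaluate(derive(exponential), 0.7), math.exp(0.7))
```

In both cases the derived series, evaluated inside the interval of convergence, agrees with the derivative of the sum function up to truncation and rounding error.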


If a function $f$ has derivatives of all orders at the point $x = 0$, we may form the
power series (which merits a display because of its great importance):

\[ \sum_{n=0}^{\infty} \frac{f^{(n)}(0)}{n!} \, x^n. \]

Whether or not this series has a positive radius of convergence (and if it does it need
not converge to $f(x)$ in its interval of convergence), we call it the Maclaurin series
of $f$. Proposition 11.15 shows that if a power series $\sum_{n=0}^{\infty} a_n x^n$ has a positive radius
of convergence, then it is the Maclaurin series of its sum function.

The permissibility of differentiating power series term-by-term makes them into 
a powerful tool for investigating transcendental functions. As function series go it is 
quite a luxury; one is apt to forget that term-by-term differentiation is not generally 
permissible for function series, and not even for commonly used function series such 
as Fourier series, without some extra conditions. In the next section we shall use 
power series to study the elementary transcendental functions. 




11.3.7 Exercises 


1. Determine the radius of convergence of the following power series: 


“di 
(a) > zn 
n=0 as I 
fore) 
gn 
(b) > zn 
n=0 n+l 
fore) 
(=1)" 5 
© ec 
m=0 m! 


Note. This type of formulation, where not all powers are displayed, is common. Here only
the even powers are displayed; the odd powers are understood to have the coefficient 0. In
other cases only odd powers may be displayed, or powers of the form $z^{3k+1}$ where $k$ is an
integer. The variations on this are many.
[o.e) 


(2+n)" 
(d) —_—.2" 
2 n+1 
oe) 2 n 
(e) > came 
n=0 . 


—1)""1(2n)! 
© (—1)"@n) 


L~ (nln — 122" 


2. (©) Prove the following formula for the radius of convergence R of a power series 


$\sum_{n=0}^{\infty} a_n x^n$:

\[ R = \frac{1}{\limsup_{n\to\infty} |a_n|^{1/n}}. \]

The formula is interpreted as $R = 0$ when the denominator is $\infty$ and as $R = \infty$
when the denominator is 0.

Hint. Consult Sect. 10.2 under “Extended forms of the ratio and Cauchy’s test’. 
Refer to Sect. 3.11 for an account of limit superior. 

3. () Use the formula obtained in the previous exercise to give a short proof that 
the derived series $\sum_{n=0}^{\infty} (n+1) a_{n+1} x^n$ has the same radius of convergence as the
original series $\sum_{n=0}^{\infty} a_n x^n$.

Hint. Prove and use the rule

\[ \limsup_{n\to\infty} c_n d_n = \big(\lim_{n\to\infty} c_n\big)\big(\limsup_{n\to\infty} d_n\big), \]


valid on the assumption that c, converges to a positive limit. 
11.4 The Power Series of Common Elementary Functions


The power series expansions of sin x and cos x were among the spectacular results 
of the early decades of calculus. They led to the biggest advance in the practical 
calculation of the circular functions since Ptolemy of Alexandria. The connection 
between the exponential function and the circular functions followed and was seen 
to justify fully the controversial introduction of the imaginary unit $i$.

The binomial series was studied by Newton, who conjectured its sum by extrapo- 
lating from the case of a positive integer exponent. He probably found further support 
for the conjecture by using the series to calculate some square roots. A reasonably 
satisfactory proof was lacking until Abel gave one. 


11.4.1 Unification of Exponential and Circular Functions 


The power series defining the exponential of a complex number

\[ \exp z = \sum_{n=0}^{\infty} \frac{1}{n!} \, z^n \]

is convergent for all complex $z$. We have seen that the function $\exp z$ satisfies the
functional equation $\exp(z + w) = \exp z \exp w$ for all complex $z$ and $w$. We also know
that for real $x$ we have $\exp x = e^x$. This motivates the notion of $e$ raised to the power
$z$, defined by

\[ e^z := \exp z, \]

and justifies the use of the name exponential.

Now we consider the function $e^{ix}$ of the real variable $x$, the restriction of $e^z$ to the
imaginary axis. We know that $i^2 = -1$, $i^3 = -i$ and $i^4 = 1$. In general $i^{2m} = (-1)^m$,
$i^{2m+1} = (-1)^m i$. We therefore have

\[
\begin{aligned}
e^{ix} &= \sum_{n=0}^{\infty} \frac{1}{n!} \, i^n x^n \\
&= \sum_{m=0}^{\infty} \frac{1}{(2m)!} \, i^{2m} x^{2m} + \sum_{m=0}^{\infty} \frac{1}{(2m+1)!} \, i^{2m+1} x^{2m+1} \\
&= \sum_{m=0}^{\infty} \frac{(-1)^m}{(2m)!} \, x^{2m} + i \sum_{m=0}^{\infty} \frac{(-1)^m}{(2m+1)!} \, x^{2m+1}.
\end{aligned}
\]

This expresses the real and imaginary parts of $e^{ix}$ as power series. Let us write
$u(x) = \operatorname{Re} e^{ix}$, $v(x) = \operatorname{Im} e^{ix}$. For all real $x$ we have
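\[
\begin{aligned}
u(x) &= \sum_{m=0}^{\infty} \frac{(-1)^m}{(2m)!} \, x^{2m} = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \frac{x^8}{8!} - \frac{x^{10}}{10!} + \cdots \\
v(x) &= \sum_{m=0}^{\infty} \frac{(-1)^m}{(2m+1)!} \, x^{2m+1} = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \frac{x^9}{9!} - \frac{x^{11}}{11!} + \cdots
\end{aligned}
\]

We can compute the derivatives of $u$ and $v$ by differentiating the series term-by-term:

\[
\begin{aligned}
u'(x) &= -x + \frac{x^3}{3!} - \frac{x^5}{5!} + \frac{x^7}{7!} - \frac{x^9}{9!} + \cdots = -v(x) \\
v'(x) &= 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \frac{x^8}{8!} - \cdots = u(x)
\end{aligned}
\]

and from this we conclude

\[ u''(x) = -u(x), \qquad v''(x) = -v(x). \]

We see that both $u$ and $v$ satisfy the differential equation $y'' + y = 0$. Hence $u(x)$ and
$v(x)$ are each of the form $A \cos x + B \sin x$ where $A$ and $B$ are constants (see Proposition 7.2).
But from the power series we see that $u(0) = 1$ and $u'(0) = 0$, giving
$u(x) = \cos x$, and that $v(0) = 0$ and $v'(0) = 1$, giving $v(x) = \sin x$. To summarise,
we have the following conclusions.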


Proposition 11.16 For all real $x$ the following formulas are valid:

(1) $\cos x = \sum_{m=0}^{\infty} \frac{(-1)^m}{(2m)!} \, x^{2m} = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \frac{x^8}{8!} - \frac{x^{10}}{10!} + \cdots$

(2) $\sin x = \sum_{m=0}^{\infty} \frac{(-1)^m}{(2m+1)!} \, x^{2m+1} = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \frac{x^9}{9!} - \cdots$

(3) $e^{ix} = \cos x + i \sin x$.


The third formula in the proposition (sometimes called Euler’s formula) tells us
that $e^{ix}$ parametrises the unit circle. Setting $x = \pi$ we derive the most beautiful
formula in mathematics,

\[ e^{i\pi} + 1 = 0. \]

The five most important numbers of analysis, 0, 1, $i$, $\pi$ and $e$ are here quite unexpectedly,
and one might even say poetically, intertwined.
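
Euler’s formula and the functional equation can be checked numerically by summing the exponential series directly. The following is a sketch only: `exp_series` is a name chosen here, the cut-off of 40 terms is an arbitrary assumption that is ample for the small arguments used, and the library functions `math.cos` and `math.sin` serve as the reference values.

```python
import math

def exp_series(z, terms=40):
    """Partial sum of the exponential series: z^0/0! + ... + z^(terms-1)/(terms-1)!."""
    total = 0j
    term = 1 + 0j  # current term z^n / n!, starting with n = 0
    for n in range(terms):
        total += term
        term *= z / (n + 1)  # turns z^n/n! into z^(n+1)/(n+1)!
    return total

for x in (0.5, 1.0, math.pi):
    approx = exp_series(1j * x)
    exact = complex(math.cos(x), math.sin(x))
    print(x, approx, abs(approx - exact))

# The functional equation exp(z + w) = exp(z) exp(w) for one sample pair.
z, w = 0.3 + 0.4j, -1.2 + 2.0j
print(abs(exp_series(z + w) - exp_series(z) * exp_series(w)))
```

For $x = \pi$ the computed value is $-1$ up to rounding error, which is the identity $e^{i\pi} + 1 = 0$ in numerical form.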


11.4.2. The Binomial Series 


The series in question is 
\[ \sum_{n=0}^{\infty} \frac{a(a-1)\cdots(a-n+1)}{n!} \, z^n. \]

The number $a$ is a constant, which could be complex. The term for $n = 0$ is always
interpreted as 1. Compare the binomial rule for a positive integral exponent $m$:

\[ (1+x)^m = \sum_{n=0}^{m} \frac{m(m-1)\cdots(m-n+1)}{n!} \, x^n = \sum_{n=0}^{m} \binom{m}{n} x^n \]

and we see why it is a plausible conjecture that for real $a$ the binomial series sums
to $(1+x)^a$.

If $a$ is not a natural number then the radius of convergence of the binomial series
is 1. We only have to apply the ratio test:

\[ \frac{a(a-1)\cdots(a-n)}{(n+1)!} \, z^{n+1} \bigg/ \frac{a(a-1)\cdots(a-n+1)}{n!} \, z^n = \frac{a-n}{n+1} \, z, \]

and the absolute value of the right-hand side tends to $|z|$ as $n \to \infty$.
Hence the binomial series converges for $|z| < 1$ and diverges for $|z| > 1$.


Proposition 11.17 Let $a$ be a real number. For all $x$ in the open interval $-1 < x < 1$
we have

\[ (1+x)^a = \sum_{n=0}^{\infty} \frac{a(a-1)\cdots(a-n+1)}{n!} \, x^n. \]


Proof Let $f(x)$ be the sum of the series for those $x$ that satisfy $|x| < 1$. We shall
show that $(x+1) f'(x) - a f(x) = 0$.
Set

\[ c_n = \frac{a(a-1)\cdots(a-n+1)}{n!}, \quad n = 1, 2, 3, \ldots, \qquad c_0 = 1, \]

so that $f(x) = \sum_{n=0}^{\infty} c_n x^n$, for $-1 < x < 1$. We saw, in the calculation of the radius
of convergence, that

\[ \frac{c_{n+1}}{c_n} = \frac{a-n}{n+1}, \]

that is

\[ (n+1) c_{n+1} = (a-n) c_n. \]

Differentiating the series term-by-term, and using this, we find

\[
\begin{aligned}
f'(x) &= \sum_{n=1}^{\infty} n c_n x^{n-1} \\
&= \sum_{n=0}^{\infty} (n+1) c_{n+1} x^n \\
&= \sum_{n=0}^{\infty} (a-n) c_n x^n \\
&= a \sum_{n=0}^{\infty} c_n x^n - x \sum_{n=1}^{\infty} n c_n x^{n-1} \\
&= a f(x) - x f'(x).
\end{aligned}
\]

We conclude that $(x+1) f'(x) - a f(x) = 0$, that is, $f$ is a solution to the differential
equation

\[ (x+1) y' - a y = 0. \]

To see why this implies that $f(x) = (1+x)^a$ we observe that

\[
\begin{aligned}
\frac{d}{dx} \big( (1+x)^{-a} f(x) \big) &= (1+x)^{-a} f'(x) - a (1+x)^{-a-1} f(x) \\
&= (1+x)^{-a-1} \big( (1+x) f'(x) - a f(x) \big) \\
&= 0,
\end{aligned}
\]

and conclude that $(1+x)^{-a} f(x)$ is a constant $C$ for $-1 < x < 1$. By setting $x = 0$
we find that $C = 1$.
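
The recurrence $(n+1) c_{n+1} = (a-n) c_n$ used in the proof also gives a convenient way to sum the binomial series numerically. The sketch below is an illustration under assumptions made here: the function name `binomial_series` and the cut-off of 200 terms are arbitrary choices, adequate for the sample values of $x$ well inside $]-1, 1[$.

```python
def binomial_series(a, x, terms=200):
    """Partial sum of the binomial series, built from c_(n+1) = c_n (a - n)/(n + 1)."""
    total = 0.0
    c = 1.0      # c_0 = 1
    power = 1.0  # x^n, starting with n = 0
    for n in range(terms):
        total += c * power
        c *= (a - n) / (n + 1)
        power *= x
    return total

for a, x in ((0.5, 0.3), (-1.5, -0.5), (3, 0.7)):
    print(a, x, binomial_series(a, x), (1 + x) ** a)
```

For the natural number $a = 3$ the coefficients vanish from $n = 4$ onwards and the series reduces to the binomial rule; for the other exponents the partial sums agree with $(1+x)^a$ to rounding accuracy.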


The product $a(a-1)\cdots(a-n+1)$ that appears in the binomial series can often
be tidied up. Of course one way is to write it as $\prod_{k=1}^{n} (a-k+1)$. There are others
that drastically alter its appearance, but are quite common. Consider the series

\[ (1+x)^{-1/2} = \sum_{n=0}^{\infty} \frac{(-\frac{1}{2})(-\frac{1}{2}-1)\cdots(-\frac{1}{2}-n+1)}{n!} \, x^n. \]

We rewrite the product in the numerator, noting that there are $n$ factors:

\[ \left(-\frac{1}{2}\right)\left(-\frac{3}{2}\right)\cdots\left(-\frac{2n-1}{2}\right) = (-1)^n \, \frac{1 \cdot 3 \cdot 5 \cdots (2n-1)}{2^n}. \]

Here we have a new product of $n$ factors, increasing in steps of 2. We insert the even
numbers, above and below the line, to obtain

\[ (-1)^n \, \frac{1 \cdot 2 \cdot 3 \cdots (2n-1)(2n)}{2^n \cdot 2 \cdot 4 \cdots (2n)} = (-1)^n \, \frac{(2n)!}{2^{2n} \, n!}. \]

This produces the series

\[ (1+x)^{-1/2} = \sum_{n=0}^{\infty} (-1)^n \, \frac{(2n)!}{2^{2n} (n!)^2} \, x^n, \]

not easily recognisable as a binomial series.
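
That the tidied-up coefficient agrees with the original product can be confirmed exactly for as many $n$ as one cares to try. This is a small sketch with names (`coeff_product`, `coeff_closed`) invented for the purpose; it checks the identity for $a = -1/2$ only.

```python
from fractions import Fraction
from math import factorial

def coeff_product(n):
    """a(a-1)...(a-n+1)/n! for a = -1/2, computed as an exact rational."""
    a = Fraction(-1, 2)
    c = Fraction(1)
    for k in range(n):
        c *= a - k
    return c / factorial(n)

def coeff_closed(n):
    """The tidied-up form (-1)^n (2n)! / (2^(2n) (n!)^2)."""
    return Fraction((-1) ** n * factorial(2 * n), 2 ** (2 * n) * factorial(n) ** 2)

for n in range(6):
    print(n, coeff_product(n), coeff_closed(n))
```

The two columns coincide, beginning 1, -1/2, 3/8, -5/16.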

The binomial series makes sense when $a$ is a complex number. This provides
the opportunity to take up again the story of the power function $x^a$, prematurely
pronounced concluded in Sect. 7.2, and extend it to complex powers $a$. In what
follows $x$ will be a positive real variable but $a$ may be complex. We define

\[ x^a := \exp(a \ln x), \qquad (x > 0). \]
Exercise Show that complex powers obey the laws of exponents: 
ath = x4 x? (x7? = xa 


where x > 0 and a and b are complex numbers. 


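Now we can calculate

\[
\begin{aligned}
\frac{d}{dx} x^a &= \frac{d}{dx} \exp(a \ln x) \\
&= \frac{a}{x} \exp(a \ln x) \\
&= a \exp(-\ln x) \exp(a \ln x) \\
&= a \exp\big((a-1) \ln x\big) \\
&= a x^{a-1}.
\end{aligned}
\]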

Exercise Justify the second equality sign by differentiating the real and imaginary 
parts of the function. 

Although we took $a$ to be a real exponent in the proof of Proposition 11.17, there
is nothing in the proof that does not work if $a$ is complex. All we need is the formula
giving the derivative of $x^a$ as $a x^{a-1}$, which we obtained just now for a complex
exponent $a$ and a real variable $x$. In this text we do not go as far as considering
differentiation with respect to a complex variable.


Exercise Verify that term-by-term differentiation is valid for a power series
$\sum_{n=0}^{\infty} a_n x^n$ with complex coefficients within its interval of convergence $-R < x < R$.


11.4.3 Series for Arctangent 


The case $a = -1$ of the binomial series gives us the formula

\[ \frac{1}{1+x^2} = \sum_{n=0}^{\infty} (-1)^n x^{2n}, \qquad -1 < x < 1. \]

If $0 < r < 1$ the series converges uniformly for $-r < x < r$. We may therefore
integrate term-by-term from 0 to $x$ and find

\[ \arctan x = \sum_{n=0}^{\infty} \int_0^x (-1)^n t^{2n} \, dt = \sum_{n=0}^{\infty} \frac{(-1)^n}{2n+1} \, x^{2n+1}, \qquad -1 < x < 1. \]

It should be clear that this holds for all $x$ in the interval $]-1, 1[$ since we can move
$r$ as near to 1 as we wish. But we should also notice that the series is convergent
for $x = 1$ (by Leibniz’s test), although the series for $1/(1+x^2)$ is divergent when
$x = 1$.

It is natural to ask whether $\arctan 1$ is equal to $\sum_{n=0}^{\infty} (-1)^n/(2n+1)$. The arguments
given above do not settle this. If the answer is yes, then we obtain a series for
$\pi/4$, namely,

\[ \frac{\pi}{4} = \sum_{n=0}^{\infty} \frac{(-1)^n}{2n+1}. \]

11.4.4 Abel’s Lemma and Dirichlet’s Test 


The answer to the question posed at the end of the last section will soon be obtained by 
the use of Abel’s theorem on power series. We postpone this topic to the next section 
and turn instead to Abel’s lemma. On the face of it, this is a diversion, referring as 
it does to numerical series (rather than function series); so it might have fitted better 
into the previous chapter. However, its main use is in proving Dirichlet’s test, an 
important application of which is to study the convergence of power series on the 
circle of convergence. And that is exactly where we have arrived in the discussion 
of power series. 


Proposition 11.18 (Abel’s lemma) Let $(a_n)_{n=1}^{\infty}$ be a complex sequence and let
$(b_n)_{n=1}^{\infty}$ be a decreasing sequence of positive, real numbers. Set $s_n = \sum_{k=1}^{n} a_k$, and
assume that there exists $M > 0$, such that $|s_n| \le M$ for all $n$. Then

\[ \left| \sum_{k=1}^{n} a_k b_k \right| \le M b_1. \]


Proof We have (as an exercise, the reader might try writing the calculation using
$\sum$-notation throughout)

\[
\begin{aligned}
\sum_{k=1}^{n} a_k b_k &= s_1 b_1 + (s_2 - s_1) b_2 + (s_3 - s_2) b_3 + \cdots + (s_n - s_{n-1}) b_n \\
&= s_1 (b_1 - b_2) + s_2 (b_2 - b_3) + \cdots + s_{n-1} (b_{n-1} - b_n) + s_n b_n.
\end{aligned}
\]

Since $b_k - b_{k+1} \ge 0$ for all $k$, we find

\[ \left| \sum_{k=1}^{n} a_k b_k \right| \le M \big( (b_1 - b_2) + (b_2 - b_3) + \cdots + (b_{n-1} - b_n) + b_n \big) = M b_1. \]


Abel’s lemma leads to a convergence test for complex series that does not test
for absolute convergence. It is sometimes useful for proving that a power series
converges at a given point on the circle of convergence, where, naturally, absolute
convergence may fail.


Proposition 11.19 (Dirichlet’s test) Let $(a_n)_{n=1}^{\infty}$ be a complex sequence, let $(b_n)_{n=1}^{\infty}$
be a decreasing sequence of positive real numbers, and assume that $\lim_{n\to\infty} b_n = 0$.
Set $s_n = \sum_{k=1}^{n} a_k$ and assume that there exists $M$, such that $|s_n| \le M$ for all $n$. Then
the series $\sum_{n=1}^{\infty} a_n b_n$ is convergent.


Proof For integers $m$ and $\ell$ such that $m \le \ell$ we have (with $s_0 = 0$)

\[ \left| \sum_{k=m}^{\ell} a_k \right| = |s_\ell - s_{m-1}| \le 2M. \]

By Abel’s lemma we now find, for $m \le n$, that

\[ \left| \sum_{k=m}^{n} a_k b_k \right| \le \left( \sup_{\ell \ge m} \left| \sum_{k=m}^{\ell} a_k \right| \right) b_m \le 2M b_m. \]

Since $\lim_{m\to\infty} b_m = 0$ the series $\sum_{k=1}^{\infty} a_k b_k$ is convergent by Cauchy’s criterion
(Proposition 10.4).


A simple case of Dirichlet’s test is to take $a_n = (-1)^{n-1}$ since then $s_n$ is alternately
1 and 0. We obtain Leibniz’s test, that $\sum_{k=1}^{\infty} (-1)^{k-1} b_k$ is convergent if $b_k$ is decreasing
and tends to 0.

A popular choice that gives new conclusions is to take $a_n = \eta^n$, where $\eta$ is a
complex number such that $|\eta| = 1$, but $\eta \neq 1$. Then we have

\[ \left| \sum_{k=0}^{n} \eta^k \right| = \left| \frac{\eta^{n+1} - 1}{\eta - 1} \right| \le \frac{2}{|\eta - 1|}. \]

Now suppose that the power series $\sum_{n=0}^{\infty} b_n z^n$ has real coefficients and radius of
convergence $R < \infty$, and assume that the sequence $(b_n R^n)_{n=0}^{\infty}$ is decreasing and
tends to 0. If $z$ is on the circle of convergence then $z = R\eta$ where $|\eta| = 1$, and we
can write

\[ \sum_{n=0}^{\infty} b_n z^n = \sum_{n=0}^{\infty} (b_n R^n) \eta^n. \]

We conclude that the series converges if $\eta \neq 1$, that is, the power series $\sum_{n=0}^{\infty} b_n z^n$
converges on its circle of convergence except possibly at $z = R$.
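
The bound on the partial sums of $\sum \eta^k$, which is what makes Dirichlet’s test applicable on the circle of convergence, can be observed numerically. The sketch below is illustrative only: `max_partial_sum` is a name chosen here, the sample angles and the number of terms are arbitrary assumptions, and floating-point arithmetic stands in for exact arithmetic.

```python
import cmath

def max_partial_sum(eta, n_max):
    """Largest value of |eta^0 + eta^1 + ... + eta^n| over 0 <= n <= n_max."""
    s = 0j
    power = 1 + 0j  # eta^k, starting with k = 0
    worst = 0.0
    for _ in range(n_max + 1):
        s += power
        worst = max(worst, abs(s))
        power *= eta
    return worst

for t in (0.1, 1.0, 3.0):
    eta = cmath.exp(1j * t)  # a point on the unit circle other than 1
    print(t, max_partial_sum(eta, 2000), 2 / abs(eta - 1))
```

In each case the largest partial sum stays below $2/|\eta - 1|$, and the bound grows as $\eta$ approaches 1, which is consistent with the exceptional role of the point $z = R$.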


11.4.5 Abel’s Theorem on Power Series 


We come to the result of Abel postponed from a previous section. It is also commonly 
known as Abel’s limit theorem. 


Proposition 11.20 (Abel’s theorem) Let the power series $\sum_{n=0}^{\infty} a_n x^n$ have real coefficients
and let its radius of convergence $R$ be finite. Set $f(x) = \sum_{n=0}^{\infty} a_n x^n$ for
$-R < x < R$. Assume that the series $\sum_{n=0}^{\infty} a_n R^n$ is convergent and let $L$ be its sum.
Then $\lim_{x \to R-} f(x) = L$.


Proof We first treat the special case when $R = 1$ and $\sum_{n=0}^{\infty} a_n = 0$. We must show
that $\lim_{x \to 1-} f(x) = 0$.
Let $s_n = \sum_{k=0}^{n} a_k$, so that $\lim_{n\to\infty} s_n = 0$. For $0 < x < 1$ we have

\[
\begin{aligned}
f(x) &= (1-x)(1-x)^{-1} \sum_{n=0}^{\infty} a_n x^n \\
&= (1-x) \Big( \sum_{n=0}^{\infty} x^n \Big) \Big( \sum_{n=0}^{\infty} a_n x^n \Big) \\
&= (1-x) \sum_{n=0}^{\infty} s_n x^n
\end{aligned}
\]

where the product in the second line is computed as the Cauchy product.
Let $\varepsilon > 0$. Choose $N$, such that $|s_n| < \varepsilon$ for all $n \ge N$. Then for $0 < x < 1$ we
have

\[
|f(x)| \le (1-x) \left| \sum_{n=0}^{N-1} s_n x^n \right| + \varepsilon (1-x) \sum_{n=N}^{\infty} x^n
= (1-x) \left| \sum_{n=0}^{N-1} s_n x^n \right| + \varepsilon x^N.
\]

The right-hand member of the equality has the limit $\varepsilon$ when $x \to 1-$. Hence there
exists $\delta > 0$, such that it is below $2\varepsilon$ for all $x$ that satisfy $1 - \delta < x < 1$. For such $x$
we have $|f(x)| < 2\varepsilon$. This shows that $\lim_{x \to 1-} f(x) = 0$.

Finally the general case. Let $s = \sum_{n=0}^{\infty} a_n R^n$. We note that the power series

\[ (a_0 - s) + \sum_{n=1}^{\infty} (a_n R^n) x^n \]

satisfies the conditions of the special case treated first. We conclude that

\[ \lim_{x \to 1-} \Big( (a_0 - s) + \sum_{n=1}^{\infty} a_n R^n x^n \Big) = (a_0 - s) + \sum_{n=1}^{\infty} a_n R^n, \]

which is equivalent to

\[ \lim_{x \to R-} \sum_{n=0}^{\infty} a_n x^n = \sum_{n=0}^{\infty} a_n R^n. \]

We can give some nice applications of Abel’s theorem. 


(a) We saw that

\[ \arctan x = \sum_{n=0}^{\infty} \frac{(-1)^n}{2n+1} \, x^{2n+1}, \qquad -1 < x < 1 \]

and that the series converges for $x = 1$ by Leibniz’s test. We conclude that

\[ \frac{\pi}{4} = \arctan 1 = \sum_{n=0}^{\infty} \frac{(-1)^n}{2n+1}. \]

This is sometimes called Gregory’s series for $\pi$.

(b) We know that $(1+x)^{-1} = \sum_{n=0}^{\infty} (-1)^n x^n$ for $-1 < x < 1$. Integrating we find

\[ \ln(1+x) = \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n} \, x^n, \qquad -1 < x < 1. \]

The series is convergent for $x = 1$. We conclude that

\[ \ln 2 = \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n}. \]

These are beautiful results. But neither series is very good for practical computation
since they converge so slowly. This is not surprising as we are operating on the
circle of convergence.
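
How slowly these two series converge is easily seen by experiment. The following sketch uses function names (`gregory_partial`, `ln2_partial`) chosen here for illustration, and compares the partial sums with the library values of $\pi$ and $\ln 2$.

```python
import math

def gregory_partial(N):
    """Four times the sum of the first N terms of Gregory's series."""
    return 4 * sum((-1) ** n / (2 * n + 1) for n in range(N))

def ln2_partial(N):
    """Sum of the first N terms of 1 - 1/2 + 1/3 - ... ."""
    return sum((-1) ** (n - 1) / n for n in range(1, N + 1))

for N in (10, 100, 1000):
    print(N,
          abs(gregory_partial(N) - math.pi),
          abs(ln2_partial(N) - math.log(2)))
```

The errors shrink only about in proportion to $1/N$: a thousand terms yield roughly three correct decimals, in contrast to the exponential series, where ten terms gave seven correct digits of $e$.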


11.4.6 Exercises 


1. Show that $\overline{e^z} = e^{\bar{z}}$ for all complex $z$.
2. Show that $|e^z| = e^{\operatorname{Re} z}$ for all complex $z$.
3. Show that if $\theta$ is real then

\[ |e^{i\theta} - 1| = 2 \left| \sin \frac{\theta}{2} \right|. \]

. Show that the set of all solutions in $\mathbb{C}$ of the equation $e^z = 1$ consists of the
numbers $2\pi n i$ where $n$ is an integer, positive, negative or 0, in short the integer
multiples of $2\pi i$.

. Show that $\exp z$ is a periodic function of the complex variable $z$ with basic
period $2\pi i$. In other words show that $\exp z = \exp(z + 2\pi i)$ for all $z$, and that if
$\exp z = \exp(z + T)$ for all $z$, then $T$ is an integer multiple of $2\pi i$.

. Show that the range of exp is the whole complex plane with the exclusion of the
point 0.

. Recall the principal logarithm of a complex variable, $\operatorname{Log} z$, defined in 9.1 for
all $z$ that are not negative real numbers or 0. The excluded set is the interval
$]-\infty, 0]$ considered as a subset of $\mathbb{C}$. Show that Log and exp are inverses to
each other if we restrict the domain of exp suitably.

More precisely, let $S$ be the strip $\{z \in \mathbb{C} : -\pi < \operatorname{Im} z < \pi\}$. Show that if $z \in S$
and $w \in \mathbb{C} \setminus ]-\infty, 0]$, then $w = \exp z$ if and only if $z = \operatorname{Log} w$.


. Prove that if $\theta$ is not an integer multiple of $2\pi$ then

\[ \sum_{k=0}^{n} e^{ik\theta} = \frac{e^{i(n+1)\theta} - 1}{e^{i\theta} - 1}. \]

What is the correct formula if $\theta$ is an integer multiple of $2\pi$?


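. Let

\[ C_n(\theta) = \sum_{k=0}^{n} \cos k\theta, \qquad S_n(\theta) = \sum_{k=0}^{n} \sin k\theta \]

for all real $\theta$ and positive integers $n$.

(a) Prove that if $\theta$ is not an integer multiple of $2\pi$ then

\[ C_n(\theta) = \frac{\cos \frac{1}{2} n\theta \, \sin \frac{1}{2}(n+1)\theta}{\sin \frac{1}{2}\theta}, \qquad S_n(\theta) = \frac{\sin \frac{1}{2} n\theta \, \sin \frac{1}{2}(n+1)\theta}{\sin \frac{1}{2}\theta}. \]

What are the correct formulas if $\theta$ is an integer multiple of $2\pi$?

(b) Show that for all $\delta > 0$ there exists $K > 0$, such that for all $n$, and for all $\theta$
in the interval $\delta < \theta < 2\pi - \delta$, we have

\[ |C_n(\theta)| < K \quad \text{and} \quad |S_n(\theta)| < K. \]

(c) In the previous item $K$ depends on $\delta$ and we cannot replace $\delta$ by 0. Show
that

\[ \lim_{n\to\infty} \sup_{0 < \theta < 2\pi} C_n(\theta) = \lim_{n\to\infty} \sup_{0 < \theta < 2\pi} S_n(\theta) = \infty. \]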


. (a) Derive a power series for the function $\frac{1}{2} \ln \left( \frac{1+x}{1-x} \right)$ from the power series
for $\ln(1+x)$.

(b) Set $x = \frac{1}{3}$ in the power series obtained in the previous item and compute $\ln 2$.
Use enough terms to get three correct decimal digits and for this purpose
estimate the tail of the series. You should only need a very few terms.
Compare with the series $\sum_{n=1}^{\infty} (-1)^{n-1}/n = \ln 2$. How many terms are
needed of the latter series to get $\ln 2$ to 3 decimal places?

(c) Set $x = \frac{1}{2}$ and compute $\ln 3$.

(d) Find a nice approximation to $\ln 10$ (and recall Sect. 7.2 Exercise 1).


. Show that the series $\sum_{n=0}^{\infty} (1/(n+1)) z^n$, which has radius of convergence 1,
converges for all complex $z$ that satisfy $|z| = 1$, except for $z = 1$, where it
diverges.


. Using the binomial series, obtain series for the following functions, and tidy 


them up after the fashion of the text. 
(a) (1+x)'? 


ib) <d4+2%"7" 
(c) (—-x’)7'”. 


. (Q) Let a be a real number, but not a natural number. You are asked to give an 


exhaustive description of the convergence, or otherwise, of the binomial series 


\[ \sum_{k=0}^{\infty} \frac{a(a-1)\cdots(a-k+1)}{k!} \, x^k \]


at the endpoints of its interval of convergence. For which values of a (excluding 
here the natural numbers) is the following true of the series? 


(a) It is absolutely convergent for x = 1. 

(b) It is absolutely convergent for x = —1. 

(c) It is conditionally convergent for x = 1. 

(d) It is conditionally convergent for x = —1. 

(e) It is divergent for x = 1. 

(f) Itis divergent for x = —1. 

Hint. Let $c_n$ be the binomial coefficient. A good place to start is from the
ratio $c_{n+1}/c_n$. For absolute convergence Gauss’s test is useful. For conditional
convergence a useful first question is: for what $a$ is the sequence of absolute
values $|c_n|$ decreasing for sufficiently large $n$? And for what $a$ does $|c_n|$ tend
to 0?


. Calculate some terms of the Maclaurin series (see Sect. 11.3 under “Differentiat- 


ing a power series” for the definition of Maclaurin series) for tan x, for example 


up to the term a3x°. 


. In the previous exercise the going gets tougher when higher powers are needed. 


Obtain a simple recurrence formula for the coefficients in the Maclaurin series 
for $\tan x$, by showing that $\tan x$ satisfies the differential equation $y' = 1 + y^2$,
and using Leibniz’s formula for the nth derivative of a product. 
17.


18. 


19. 


20. 
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Using only the series for cos x and sin x and not relying on knowledge about 
these functions, prove that cosx has a lowest positive zero, and that it lies 
between ./2 and /3. 

Note. In texts that use the power series of sin x and cos x as definitions of the circular functions, 
the result of this exercise is used to define the number z as twice the lowest positive zero of 
COS xX. 

Suppose that the function f(x) has derivatives of all orders at x = 0, and that 
f() 4 0. Show that 1/f has derivatives of all orders at x = 0. 

Suppose that f has the Maclaurin series )°°°) a,x”, with ao 4 0, and that 
1/f has the Maclaurin series }°°°, b,x”. Show that the coefficients b, can be 
calculated from the recurrence relations 


ie 1 
bn =-—)> abn (Nn =1), bo = —. 
ag fai ag 


This holds irrespectively of whether either of the two Maclaurin series has a 
positive radius of convergence. 

Suppose that f has the Maclaurin series )°™° 9 dnx", with ay 4 0, and that 1/f 
has the Maclaurin series }°>° 9 b,x”. Suppose further that the series )°7° 9 an.x” 
converges to f(x) in an interval ]—R, R[ centred at 0. It is natural to ask whether 
the series )°°° , b,x” converges to 1/f (x) in some interval centred at 0. This 
question can be answered most satisfactorily by means of complex analysis. 
However, using only methods of this text, one can produce an interval in which 
yo bnx” converges to 1/f (x), though it may fall far short of the largest one. 
Assume that 0 < r < R and set M = max;3 la, |r*. 


(a) Show that 


n—-1 

M 

Ibn |r” < al Yo ldelr*é, (KE 1). 
k=0 


(b) Deduce that 
M wy 
lb,|r™ < —,(1+ ; (k21). 


|ao|? |ao| 
(c) Deduce that the series baer b,x" converges if 


|ag|r 


a= 
M + |ao| 


and has the sum 1/f (x). 
(d) Obtain some lower estimates for the interval in which tan x can be repre- 
sented by its Maclaurin series. 


(a) Obtain the power series representation 
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21. 


22. 


23. 


24. 


(oe) 


; (Qn)! x2nt! 
arcsin x = ys 2(nly2 In +1 
n=1 


for—l<x<l. 
(b) (}) Show that the representation of arcsin x obtained in the previous item 
remains valid at x = +1. This gives two further series for zr. 
Let the series )°7° , ax be convergent. Without using Abel’s theorem on power 
series, show that the power series }°°, a,x* is uniformly convergent with 
respect to x in the closed interval [0, 1]. 
Hint. Study the tail )°°-.,, a,x*, using a treatment similar to that in the proof of 
Abel’s theorem. 
Note. This gives another proof of Abel’s theorem. 
Show that the Maclaurin series for f(x) := (1 — x)? converges uniformly to 
Ff (x) on the closed interval [0, 1]. 
Hint. By the previous exercise it is enough to show that the Maclaurin series 
converges at x = 1. If you haven’t read the nugget on Gauss’s test you might try 
to prove, since the terms plainly alternate in sign, that the coefficients tend to 0. 
Note. The result shows that the function ./x can be approximated uniformly by polynomials in 


the interval [0, 1]. This is the first step in proving the Stone—Weierstrass theorem, a very general 
result of metric space theory that includes as a special case the Weierstrass approximation 
theorem, which tells us that a continuous function can be approximated uniformly in a bounded 
interval by polynomials with arbitrary accuracy. 

The convergence of the series for arctanx and In(1+ x) at x =1 can be 
obtained, without using Abel’s theorem, by keeping a close eye on the remainder 
terms in the series being integrated. More precisely, write 


mole ig =e = a 


and derive 


-1 
(—1)*x k+1 x (—1)"t" 
Indi +x) = a tal +f es oo 


k= 


valid for all x > —1. Show that for x = | the remainder tends to 0 as n > ow. 
Carry out a similar analysis for arctan x. 

Gregory’s series for 2 converges too slowly to be practical as a way of computing 
zt. Some games with the addition formula for arctangent give better series. Prove 
the following formulas, and use the second to compute zr. 
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(a) 
(b) 
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1 
= arctan 5 + arctan — 


AAAS 


= 4arctan — — arctan —_. 
5 239 


In the remaining exercises in this section, we look at some consequences of Abel’s 
lemma and Dirichlet’s test. One benefit is a test that can decide uniform convergence 
of a function series that is not absolutely convergent. 


25. Show that the series $\sum_{n=1}^{\infty} \sin(nx)/n$ is convergent for every x.

26. Prove the following useful variant of Dirichlet’s test which imposes stronger
conditions on $a_k$ but weaker ones on $b_k$. Suppose that the series $\sum_{k=1}^{\infty} a_k$ (which
may be complex) is convergent and that $(b_k)_{k=1}^{\infty}$ is a monotonic and bounded
sequence of real numbers. Then the series $\sum_{k=1}^{\infty} a_k b_k$ is convergent.

27. Let $(a_n)_{n=1}^{\infty}$ and $(b_n)_{n=1}^{\infty}$ be complex sequences. Let $s_n = \sum_{k=1}^{n} a_k$ for each n ≥ 1.
Rewrite the calculation in the proof of Abel’s lemma to give the formula:

\[ \sum_{k=1}^{n} a_k b_k = s_n b_{n+1} - \sum_{k=1}^{n} s_k (b_{k+1} - b_k). \]

Deduce from this another variant of Dirichlet’s test: if $\sum_{k=1}^{\infty} a_k$ is convergent
and $\sum_{k=1}^{\infty} (b_{k+1} - b_k)$ absolutely convergent, then $\sum_{k=1}^{\infty} a_k b_k$ is convergent.
Note. This includes the result of the previous exercise as a special case, but is stronger; for
example, $b_k$ can be complex. For an example of its use see Exercise 32.

28. Dirichlet’s test is the basis of a test for uniform convergence, which is sometimes
useful for function series $\sum_{n=1}^{\infty} f_n(x)$ that are not absolutely convergent for some
values of x. For such series the Weierstrass M-test cannot succeed. Prove the
following proposition. The proof is the same as for Dirichlet’s test but functions
replace numbers.

Let $(u_n)_{n=1}^{\infty}$ and $(v_n)_{n=1}^{\infty}$ be function sequences on the same domain A. Suppose
that there exists M, such that for all n and for all x in A we have
$\bigl|\sum_{k=1}^{n} v_k(x)\bigr| \le M$. Suppose further that the functions $u_n$ are real valued, that
$u_n(x)$ decreases with increasing n for each x in A, and that $u_n$ tends uniformly
to 0 on A. Then the series $\sum_{n=1}^{\infty} u_n(x) v_n(x)$ converges uniformly with respect
to x in A.

There is a variant of this, similar to Exercise 26:

Let $(u_n)_{n=1}^{\infty}$ and $(v_n)_{n=1}^{\infty}$ be function sequences on the same domain A. Assume
that the function series $\sum_{k=1}^{\infty} v_k(x)$ is uniformly convergent for x in A. Suppose
further that the functions $u_n$ are real valued, that $u_n(x)$ decreases with increasing
n for each x in A, and that there exists K > 0, such that $|u_n(x)| \le K$ for all n and
for all x in A. Then the function series $\sum_{n=1}^{\infty} u_n(x) v_n(x)$ converges uniformly
with respect to x in A.

29. Let $f(x) = \sum_{n=1}^{\infty} \sin(nx)/n$. Show that for all $\delta > 0$ the series is uniformly
convergent with respect to x in the interval $[\delta, 2\pi - \delta]$. Conclude that f is
continuous in the open interval $]0, 2\pi[$.

Hint. Use Exercise 9 and the previous exercise.

Note. The function f is actually discontinuous at multiples of $2\pi$ and exhibits a “saw tooth”
pattern. The series is a basic example of a Fourier series representation of a discontinuous
function.

30. Let s be a complex number. Show that the series $\sum_{n=1}^{\infty} n^{-s}$ is absolutely
convergent if Re s > 1.

Note. The function $\zeta(s) = \sum_{n=1}^{\infty} n^{-s}$ is the Riemann zeta function and was referred to in
Sect. 10.3 Exercise 2. It plays a really important role in number theory. In this exercise we have
defined it for Re s > 1 but it can be extended to the left of the line Re s = 1 in an essentially
unique way.

31. Series of the form $\sum_{n=1}^{\infty} a_n n^{-s}$, with complex coefficients $a_n$, are known as
Dirichlet series. They generalise the series of the previous exercise. Normally
they are studied with complex s, but in this exercise we restrict ourselves to
real s.

(a) Suppose that the series $\sum_{n=1}^{\infty} a_n$ is absolutely convergent. Show that for
every s > 0 the series $\sum_{n=1}^{\infty} a_n n^{-s}$ is absolutely convergent.

(b) Suppose that the sequence of partial sums $\sum_{k=1}^{n} a_k$ is bounded (in particular,
this is the case if the series $\sum_{n=1}^{\infty} a_n$ is convergent). Show that for every s > 0
the series $\sum_{n=1}^{\infty} a_n n^{-s}$ is convergent.

32. Extend the conclusions of the previous exercise to complex s. The main
difference lies in the test needed for item (b).

(a) Suppose that the series $\sum_{n=1}^{\infty} a_n$ is absolutely convergent. Show that for
every complex s such that Re s > 0 the series $\sum_{n=1}^{\infty} a_n n^{-s}$ is absolutely
convergent.

(b) Suppose that the sequence of partial sums $\sum_{k=1}^{n} a_k$ is bounded. Show that for
every complex s such that Re s > 0 the series $\sum_{n=1}^{\infty} a_n n^{-s}$ is convergent.
Hint. Use the fact that if Re t > 0 the series $\sum_{n=1}^{\infty} \bigl(n^{-t} - (n+1)^{-t}\bigr)$ is
absolutely convergent. This can be seen by computing the limit

\[ \lim_{x \to \infty} \frac{x^{-t} - (x+1)^{-t}}{x^{-t-1}}, \]

where t is complex. Then use the version of Dirichlet’s test given in
Exercise 27.


11.5 (◊) Summability Theories


There is an old puzzle about the light switch that is turned on at one minute to 
midnight, off at half-a-minute to midnight, on again at a quarter-of-a-minute to 
midnight, and so on, always halving the time between successive switchings. We 
ask: is the light on or off at midnight? 

It seems we are asking whether the infinite sum 


\[ \sum_{n=1}^{\infty} (-1)^{n-1} = 1 - 1 + 1 - 1 + 1 - 1 + \cdots \]


is 0 or 1. Of course the series is divergent; having a correct definition of limit seems 
to dispense with the question as being meaningless. However, mathematicians are 
not so ready to give up, and have invented summability theories to shed light on this 
question. 

A summability theory is a procedure for assigning values to infinite sums $\sum_{n=1}^{\infty} a_n$,
that assigns the correct value to convergent series, assigns a value to some divergent
series, and satisfies certain natural rules. We want the sum to be a linear operation,
that is,

\[ \sum_{n=1}^{\infty} (\alpha a_n + \beta b_n) = \alpha \sum_{n=1}^{\infty} a_n + \beta \sum_{n=1}^{\infty} b_n, \]

and we want it to behave naturally with regard to tacking on extra terms at the front,

\[ \sum_{n=1}^{\infty} b_n = a + \sum_{n=1}^{\infty} a_n, \]

where $b_1 = a$ and $b_{n+1} = a_n$ for n = 1, 2, 3, ...
These rules alone fix the value of $\sum_{n=1}^{\infty} (-1)^{n-1}$ in any summability theory, for
calling the sum s, we have

\[ 1 - s = 1 - \sum_{n=1}^{\infty} (-1)^{n-1} = \sum_{n=1}^{\infty} (-1)^{n-1} = s \]

so that $s = \frac{1}{2}$. Quite a sensible conclusion which suggests that the light is equally
likely to be on or off.
There are two principal summability theories that are commonly seen.
There are two principal summability theories that are commonly seen. 


Cesàro summation


Also known as (C,1)-summation or summation by arithmetic means. We let

\[ s_n = \sum_{k=1}^{n} a_k, \qquad \sigma_n = \frac{1}{n} \sum_{k=1}^{n} s_k. \]

The Cesàro sum of the series $\sum_{k=1}^{\infty} a_k$ is the limit $\lim_{n \to \infty} \sigma_n$, provided the limit
exists. By Proposition 3.23 this assigns the correct value to convergent series.
One can take this further by considering the sequence

\[ \sigma'_n = \frac{1}{n} \sum_{k=1}^{n} \sigma_k \]

of arithmetic means of the sequence $(\sigma_n)_{n=1}^{\infty}$ of arithmetic means. The limit
$\lim_{n \to \infty} \sigma'_n$, if it exists, is the (C,2)-sum of $\sum_{k=1}^{\infty} a_k$. In this way we can produce
a whole scale of summability methods, (C,m)-summability, for m = 1, 2, 3, ...
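
The arithmetic means can be watched converging numerically. The following is a minimal sketch in Python, not part of the original text; the function name `cesaro_means` is an illustrative choice, and exact rational arithmetic is used so that no rounding obscures the pattern.

```python
from fractions import Fraction

def cesaro_means(terms):
    """Return the (C,1) means sigma_n = (s_1 + ... + s_n)/n of the partial sums s_n."""
    means = []
    partial = Fraction(0)   # s_n, the n-th partial sum
    running = Fraction(0)   # s_1 + ... + s_n
    for n, a in enumerate(terms, start=1):
        partial += a
        running += partial
        means.append(running / n)
    return means

# The series 1 - 1 + 1 - 1 + ... : its partial sums oscillate between 1 and 0.
grandi = [(-1) ** (n - 1) for n in range(1, 1001)]
sigma = cesaro_means(grandi)
print(sigma[998], sigma[999])   # sigma_999 and sigma_1000
```

The means oscillate ever more tightly around 1/2, in agreement with the value forced by the general rules above.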


Abel summation 


We consider the power series $\sum_{n=0}^{\infty} a_n x^n$. If its radius of convergence is greater than
or equal to 1 we set $F(x) = \sum_{n=0}^{\infty} a_n x^n$ for |x| < 1. The limit, $\lim_{x \to 1-} F(x)$, if it
exists, is called the Abel sum of the series $\sum_{n=0}^{\infty} a_n$. By Abel’s theorem on power
series this assigns the correct value to convergent series.
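
As a hedged numerical illustration (not from the original text), the sketch below evaluates a long truncation of $F(x)$ for the series $1 - 1 + 1 - \cdots$, here indexed from n = 0 so that $a_n = (-1)^n$ and $F(x) = 1/(1+x)$. The helper name `power_series_value` and the truncation length are assumptions made for the example.

```python
def power_series_value(coeffs, x):
    """Evaluate sum_{n=0}^{N-1} coeffs[n] * x**n by Horner's rule."""
    total = 0.0
    for a in reversed(coeffs):
        total = total * x + a
    return total

# a_n = (-1)^n; 50000 terms make the truncation error negligible for x <= 0.999.
coeffs = [(-1) ** n for n in range(50000)]
for x in (0.9, 0.99, 0.999):
    print(x, power_series_value(coeffs, x), 1.0 / (1.0 + x))
```

As x approaches 1 from below the values approach 1/2, the Abel sum of the series.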


11.5.1 Abelian Theorems and Tauberian Theorems 


It may be that a series, whose normal convergence status is unknown, can be shown
to be Abel summable, or Cesàro summable. We cannot of course deduce that the
series in question is convergent in the normal sense from either of these facts, as the
case of $\sum_{n=1}^{\infty} (-1)^{n-1}$ shows. But an additional condition imposed on the sequence
$(a_n)_{n=0}^{\infty}$ may suffice to make the deduction. Many such conditions are now known.
The following result is their prototype.


Proposition 11.21 (Tauber’s theorem) Suppose that the series $\sum_{n=0}^{\infty} a_n$ is Abel
summable with sum s, assumed to be a finite number. Assume that $\lim_{n \to \infty} n a_n = 0$.
Then the series is convergent in the normal sense, with sum s.


Proof Let $F(x) = \sum_{k=0}^{\infty} a_k x^k$ for 0 < x < 1. The assumption of Abel summability
means that $\lim_{x \to 1-} F(x) = s$. Let $s_n = \sum_{k=0}^{n} a_k$. For each n and each x in the
interval ]0, 1[ we have

\[ s_n - s = F(x) - s + \sum_{k=0}^{n} a_k (1 - x^k) - \sum_{k=n+1}^{\infty} a_k x^k. \]

Since 0 < x < 1 we have

\[ 1 - x^k = (1 - x)(1 + x + x^2 + \cdots + x^{k-1}) \le k(1 - x). \]

And for k ≥ n + 1 we have 1 ≤ k/(n + 1). Therefore

\[ |s_n - s| \le |F(x) - s| + (1 - x) \sum_{k=0}^{n} k|a_k| + \frac{1}{n+1} \sum_{k=n+1}^{\infty} k|a_k| x^k. \]

Let $\varepsilon > 0$. Since $\lim_{n \to \infty} n a_n = 0$ there exists N, such that

\[ n|a_n| < \varepsilon \quad \text{and} \quad \frac{1}{n+1} \sum_{k=0}^{n} k|a_k| < \varepsilon \]

for all n ≥ N. Then for 0 < x < 1 and n ≥ N we have

\[ |s_n - s| \le |F(x) - s| + (1 - x)(n + 1)\varepsilon + \frac{\varepsilon}{(n+1)(1-x)}. \]

In particular, putting $x = x_n = 1 - 1/(n+1)$, we find for n ≥ N that

\[ |s_n - s| \le |F(x_n) - s| + 2\varepsilon. \]

Letting n → ∞ we obtain

\[ \limsup_{n \to \infty} |s_n - s| \le 2\varepsilon \]

and since $\varepsilon$ is arbitrary we conclude $\lim_{n \to \infty} s_n = s$ as required.


In general a Tauberian theorem enables one to conclude that a series that is 
summable by a given summability method is also convergent in the usual sense, 
given some additional condition (a Tauberian condition) imposed on the terms. An 
Abelian theorem enables one to conclude that a prospective summability theory 
assigns the correct value to series convergent in the normal sense. 


11.5.2 Exercises


1. Show that both the Abel sum and the (C,1)-sum of $\sum_{n=1}^{\infty} (-1)^{n-1}$ are equal to $\frac{1}{2}$.

2. Show that a series that is (C,1)-summable is also Abel summable, and has the
same sum by both methods.

Hint. Copy the proof of Abel’s theorem but using $(1 - x)^{-2}$ instead of $(1 - x)^{-1}$.
This might suggest the more general result that (C,m)-summability implies
Abel-summability, and a strategy for proving it.

3. Prove the following Tauberian theorem. Let $a_n \ge 0$ for all n and suppose that the
series $\sum_{n=0}^{\infty} a_n$ is Abel-summable. Then the series is convergent in the normal
sense.

4. Suppose that the series $\sum_{n=0}^{\infty} a_n$ and $\sum_{n=0}^{\infty} b_n$ are convergent, with sums s and
t respectively. Show that their Cauchy product is (C,1)-summable, with sum st.
This implies a theorem of Abel: if the Cauchy product is convergent then its sum
is st.

11.5.3 Pointers to Further Study 

— Abelian and Tauberian theorems. 

— Fejér’s theorem.

11.6 (◊) The Irrationality of e and π


Euler published a proof that e is irrational in 1740. We present here a simple proof 
due to Fourier (early nineteenth century). 
We begin with

\[ e = \sum_{k=0}^{\infty} \frac{1}{k!}. \]

We can estimate the tail as follows:

\[
\begin{aligned}
\sum_{k=m+1}^{\infty} \frac{1}{k!}
&= \frac{1}{(m+1)!} + \frac{1}{(m+2)!} + \frac{1}{(m+3)!} + \cdots \\
&< \frac{1}{(m+1)!} \left( 1 + \frac{1}{m+2} + \frac{1}{(m+2)^2} + \cdots \right) \\
&= \frac{1}{(m+1)!} \, \frac{m+2}{m+1}
\end{aligned}
\]

since the series after the inequality sign is a geometric series. We obtain an estimate
for the error

\[ 0 < e - \sum_{k=0}^{m} \frac{1}{k!} < \frac{1}{(m+1)!} \, \frac{m+2}{m+1} \]

which leads to

\[ 0 < m! \left( e - \sum_{k=0}^{m} \frac{1}{k!} \right) < \frac{m+2}{(m+1)^2}. \tag{11.1} \]

If e = a/b, where a and b are integers, then the central quantity in the
inequalities (11.1) is an integer when m ≥ b. On the other hand the right-hand member
$(m+2)/(m+1)^2$ tends to 0 as m → ∞. It is clearly impossible that an integer can
lie strictly between 0 and $(m+2)/(m+1)^2$ for sufficiently large m, in fact not even
for m ≥ 1. This proves the irrationality of e.
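
The inequalities (11.1) can be checked with exact rational arithmetic. The sketch below is an illustration only and not part of the original argument: since e itself cannot be represented exactly, it uses a truncated tail, which is a lower bound for the central quantity in (11.1). The function name `scaled_tail` and the number of extra terms are assumptions.

```python
from fractions import Fraction
from math import factorial

def scaled_tail(m, extra_terms=30):
    """m! * sum_{k=m+1}^{m+extra_terms} 1/k!, a lower bound for m!(e - sum_{k<=m} 1/k!)."""
    return sum(Fraction(factorial(m), factorial(k))
               for k in range(m + 1, m + 1 + extra_terms))

for m in range(1, 8):
    tail = scaled_tail(m)
    bound = Fraction(m + 2, (m + 1) ** 2)
    # The scaled tail is trapped strictly between 0 and a bound below 1,
    # so it cannot be an integer.
    print(m, float(tail), float(bound), 0 < tail < bound < 1)
```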

The first proof of the irrationality of π was published by Lambert in 1761. Another
proof, due to Hermite (second half of nineteenth century) is widely available in the
books.

Here is a proof, possibly due to Mary Cartwright (early twentieth century), who 
was mistress of Girton College, Cambridge. In any case she set it as an examination 
problem. The proof is in several steps, which Cartwright would have called exercises. 
To encourage the reader to adopt the right spirit, we give the solutions in a separate 
section. 

For each real $\alpha$ let

\[ I_n(\alpha) = \int_{-1}^{1} (1 - x^2)^n \cos \alpha x \, dx, \qquad n = 0, 1, 2, \ldots \]

Step 1. Derive the reduction formula

\[ \alpha^2 I_n(\alpha) = 2n(2n-1) I_{n-1}(\alpha) - 4n(n-1) I_{n-2}(\alpha), \qquad n = 2, 3, 4, \ldots \]

Step 2. Show that there exist polynomials $P_n(\alpha)$ and $Q_n(\alpha)$, with degree less than
or equal to n, and with integer coefficients, such that

\[ \alpha^{2n+1} I_n(\alpha) = n! \bigl( P_n(\alpha) \sin \alpha + Q_n(\alpha) \cos \alpha \bigr), \qquad n = 0, 1, 2, \ldots \]

Step 3. Prove the following: if $\pi/2 = b/a$, where a and b are integers, then

\[ \frac{b^{2n+1}}{n!} I_n\!\left(\frac{b}{a}\right) \]

is an integer.

Step 4. Show that the assumption that $\pi/2 = b/a$, where a and b are integers, leads
to a contradiction. In other words: π is irrational.


11.6.1 Solutions 


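Step 1. Let n ≥ 2. Integrate twice by parts and note that $(1 - x^2)^n$ and $(1 - x^2)^{n-1}$
are both 0 for x = ±1:

\[
\begin{aligned}
I_n(\alpha) &= \frac{2n}{\alpha} \int_{-1}^{1} (1 - x^2)^{n-1} x \sin \alpha x \, dx \\
&= \frac{2n}{\alpha^2} \int_{-1}^{1} \bigl( (1 - x^2)^{n-1} - 2(n-1) x^2 (1 - x^2)^{n-2} \bigr) \cos \alpha x \, dx \\
&= \frac{2n}{\alpha^2} \int_{-1}^{1} \bigl( (1 - x^2)^{n-1} + 2(n-1)(1 - x^2)^{n-1} - 2(n-1)(1 - x^2)^{n-2} \bigr) \cos \alpha x \, dx \\
&= \frac{2n}{\alpha^2} \int_{-1}^{1} \bigl( (2n-1)(1 - x^2)^{n-1} - 2(n-1)(1 - x^2)^{n-2} \bigr) \cos \alpha x \, dx \\
&= \frac{2n(2n-1)}{\alpha^2} I_{n-1}(\alpha) - \frac{4n(n-1)}{\alpha^2} I_{n-2}(\alpha)
\end{aligned}
\]

and the reduction formula is proved.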
Step 2. We use induction. We assume that the claim is true for place numbers up to
n − 1, that is, we assume that $P_k$ and $Q_k$ exist for k ≤ n − 1 and satisfy

\[ \alpha^{2k+1} I_k(\alpha) = k! \bigl( P_k(\alpha) \sin \alpha + Q_k(\alpha) \cos \alpha \bigr). \]

By step 1 we then have

\[
\begin{aligned}
\alpha^{2n+1} I_n &= 2n(2n-1) \alpha^{2n-1} I_{n-1} - 4n(n-1) \alpha^2 \alpha^{2n-3} I_{n-2} \\
&= 2n(2n-1)(n-1)! \bigl( P_{n-1} \sin \alpha + Q_{n-1} \cos \alpha \bigr) \\
&\quad - 4n(n-1)(n-2)! \, \alpha^2 \bigl( P_{n-2} \sin \alpha + Q_{n-2} \cos \alpha \bigr)
\end{aligned}
\]

and we obtain recurrence relations for $P_n$ and $Q_n$:

\[
\begin{aligned}
P_n &= 2(2n-1) P_{n-1} - 4\alpha^2 P_{n-2} \\
Q_n &= 2(2n-1) Q_{n-1} - 4\alpha^2 Q_{n-2}
\end{aligned}
\]

valid for all n ≥ 2.

From these relations we conclude that if $P_{n-1}$ and $Q_{n-1}$ are polynomials with
degree less than or equal to n − 1 and with integer coefficients, and if $P_{n-2}$ and $Q_{n-2}$
are polynomials with degree less than or equal to n − 2 and with integer coefficients,
then $P_n$ and $Q_n$ are polynomials with degree less than or equal to n and have integer
coefficients.

To complete the induction we must examine the initial values. They are

\[ P_0 = 2, \qquad Q_0 = 0, \qquad P_1 = 4, \qquad Q_1 = -4\alpha, \]

so that $P_0$, $P_1$, $Q_0$ and $Q_1$ are polynomials with integer coefficients, and their degrees
satisfy the required bounds.

Step 3. Suppose that $\pi/2 = b/a$. Since $\cos(\pi/2) = 0$ and $\sin(\pi/2) = 1$ the result of
step 2 with $\alpha = b/a$ implies

\[ \frac{b^{2n+1}}{n!} I_n\!\left(\frac{b}{a}\right) = a^{2n+1} P_n\!\left(\frac{b}{a}\right). \]

Now the degree of $P_n$ is less than 2n + 1, and so $a^{2n+1} P_n(b/a)$ is an integer. Hence
$(b^{2n+1}/n!) I_n(b/a)$ is an integer. This completes step 3.

Step 4. We show that $(b^{2n+1}/n!) I_n(b/a)$ cannot be an integer when n is sufficiently
high. In fact we have $\lim_{n \to \infty} b^{2n+1}/n! = 0$, but also $|I_n(\alpha)| \le 2$ (an obvious bound;
actually $I_n(\alpha)$ tends to 0 for each $\alpha$). From this we find that $(b^{2n+1}/n!) I_n(b/a)$ tends
to 0, but it is manifestly never equal to 0, for $\cos(\pi x/2) > 0$ on the interval ]−1, 1[
and so the integral is strictly positive. This is impossible because $(b^{2n+1}/n!) I_n(b/a)$
is an integer for all n.
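
The recurrences of step 2 lend themselves to a quick numerical sanity check. The sketch below is illustrative only: it builds $P_n$ and $Q_n$ as integer coefficient lists starting from $P_0 = 2$, $Q_0 = 0$, $P_1 = 4$, $Q_1 = -4\alpha$ (values obtained by integrating $I_0$ and $I_1$ directly), and compares both sides of the identity in step 2, with the integral approximated by Simpson's rule. All function names and the test value of $\alpha$ are assumptions made for the example.

```python
import math

def poly_eval(coeffs, x):
    """Evaluate a polynomial given as a list of coefficients indexed by power."""
    return sum(c * x ** i for i, c in enumerate(coeffs))

def next_poly(prev, prev2, n):
    """Coefficients of 2(2n-1)*prev - 4*alpha^2*prev2."""
    out = [0] * max(len(prev), len(prev2) + 2)
    for i, c in enumerate(prev):
        out[i] += 2 * (2 * n - 1) * c
    for i, c in enumerate(prev2):
        out[i + 2] -= 4 * c
    return out

def cartwright_polys(n_max):
    P = [[2], [4]]        # P_0 = 2, P_1 = 4
    Q = [[0], [0, -4]]    # Q_0 = 0, Q_1 = -4*alpha
    for n in range(2, n_max + 1):
        P.append(next_poly(P[n - 1], P[n - 2], n))
        Q.append(next_poly(Q[n - 1], Q[n - 2], n))
    return P, Q

def integral_I(n, alpha, steps=2000):
    """Composite Simpson approximation of I_n(alpha) on [-1, 1] (steps must be even)."""
    h = 2.0 / steps
    f = lambda x: (1 - x * x) ** n * math.cos(alpha * x)
    total = f(-1.0) + f(1.0)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(-1.0 + i * h)
    return total * h / 3

def max_discrepancy(alpha, n_max=5):
    """Largest gap between the two sides of the step 2 identity for n = 0..n_max."""
    P, Q = cartwright_polys(n_max)
    worst = 0.0
    for n in range(n_max + 1):
        lhs = alpha ** (2 * n + 1) * integral_I(n, alpha)
        rhs = math.factorial(n) * (poly_eval(P[n], alpha) * math.sin(alpha)
                                   + poly_eval(Q[n], alpha) * math.cos(alpha))
        worst = max(worst, abs(lhs - rhs))
    return worst

print(cartwright_polys(3)[0])   # integer coefficients of P_0, ..., P_3
print(max_discrepancy(1.3))     # should be tiny (quadrature and rounding error only)
```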


11.6.2 Pointers to Further Study 


— Irrationality theory 
— Diophantine approximation 
— Transcendence of e and π


11.7 Taylor Series 


Let f(x) be a function with domain A, an open interval, and suppose that f has
derivatives of all orders. Fix a point c in A. The power series

\[ \sum_{n=0}^{\infty} \frac{f^{(n)}(c)}{n!} (x - c)^n \]

is called the Taylor series of f with centre c. The Taylor series with centre 0 (formable
if 0 is in A) is, as we mentioned in Sect. 11.3, called the Maclaurin series of f.
We list some important points:


(a) The Taylor series of f can have radius of convergence 0. In fact, according to a
theorem of E. Borel, any sequence of real numbers is the coefficient sequence
of the Maclaurin series of some function. So, for example, there is a function
having derivatives of all orders, whose Maclaurin series is $\sum_{n=0}^{\infty} n! \, x^n$.

(b) Even if the radius of convergence R is not 0, the sum of the Taylor series of f,
for c − R < x < c + R, need not be equal to f(x).

(c) The partial sums of the Taylor series of f are the Taylor polynomials with
centre c.

(d) According to the theorem on the differentiation of power series (Proposition
11.15), if a power series $\sum_{n=0}^{\infty} a_n (x - c)^n$ is convergent, with sum g(x), for
c − R < x < c + R, then the series is the Taylor series of g with centre c.

(e) A given power series can be the Taylor series of more than one function. 


A key example for understanding the pitfalls of Taylor series is the function f
with domain R given by

\[ f(x) = \begin{cases} e^{-1/x}, & \text{if } x > 0 \\ 0, & \text{if } x \le 0. \end{cases} \]

Obviously f has derivatives of all orders for x ≠ 0; but what about x = 0?
By induction one may show that for x > 0 we can write

\[ f^{(n)}(x) = P_n\!\left(\frac{1}{x}\right) e^{-1/x} \]

where $P_n$ is a polynomial of degree 2n. Since the exponential function overwhelms
any polynomial we see that $\lim_{x \to 0+} f^{(n)}(x) = 0$ for all n. A further induction (see
Sect. 5.6 Exercises 6 and 7) now shows that all derivatives $f^{(n)}(0)$ exist and are 0.

But there is a further, immediate and surprising conclusion. All Taylor polynomials
with centre 0 are 0. The Maclaurin series has every coefficient $a_n$ equal to 0, and its
sum is 0 for every x.


11.7.1 Taylor’s Theorem with Lagrange’s Remainder 


Proposition 11.22 Assume that $f : A \to \mathbb{R}$ has derivatives of order up to n, let
$c \in A$, and let $x \in A$. Then there exists $\xi$ in the interval ]c, x[ (if x > c) or in the
interval ]x, c[ (if x < c), such that

\[ f(x) = \sum_{k=0}^{n-1} \frac{f^{(k)}(c)}{k!} (x - c)^k + \frac{f^{(n)}(\xi)}{n!} (x - c)^n. \]

Before proving this we shall discuss its meaning. The first expression on the right
is the Taylor polynomial of degree n − 1 centred at c (as a polynomial it may have
degree less than n − 1), and the second is the remainder, or error, that is incurred
when the Taylor polynomial is used to approximate f(x). The phrasing implies that
x ≠ c; there can be no $\xi$ in ]x, c[ if x = c as the interval is empty.

The exclusion of x = c is of no real consequence. There is another form of the
same expression that is sometimes easier to handle. Set x − c = h and set $\xi = c + \theta h$
where $0 < \theta < 1$. Then

\[ f(c + h) = \sum_{k=0}^{n-1} \frac{f^{(k)}(c)}{k!} h^k + \frac{f^{(n)}(c + \theta h)}{n!} h^n. \]

When using this formula one must remember that h can be negative, just as in the
first version x can be below c. Similarly |h| could be big. The only restriction on h
is that c + h should lie in the interval A, which could be unbounded. In this version
there is no problem with having h = 0.
Both forms of the formula fail to indicate the important point that $\xi$ and $\theta$ depend
on x, c and n.
Taylor’s theorem takes as its starting point the formula

\[ f(x) = P_{n-1}(x, c) + R_n(x, c) \]

involving the Taylor polynomial and the remainder, and the whole content is an
assertion that the remainder may be expressed variously as

\[ R_n(x, c) = \frac{f^{(n)}(\xi)}{n!} (x - c)^n \]

where $\xi$ is between c and x, or as

\[ R_n(c + h, c) = \frac{f^{(n)}(c + \theta h)}{n!} h^n \]

where $0 < \theta < 1$. Both of these expressions are called Lagrange’s form of the
remainder.


First proof of Taylor’s theorem with Lagrange’s remainder We prove the
proposition in the second version, the one in which x is replaced by c + h. Suppose
that h ≠ 0. For 0 ≤ t ≤ h or h ≤ t ≤ 0 (depending on the sign of h), we set

\[ g(t) = f(c + t) - \sum_{k=0}^{n-1} \frac{f^{(k)}(c)}{k!} t^k - \frac{B}{n!} t^n \tag{11.2} \]

where the constant B is chosen so that g(h) = 0, that is, B is determined by

\[ f(c + h) = \sum_{k=0}^{n-1} \frac{f^{(k)}(c)}{k!} h^k + \frac{B}{n!} h^n. \]

We only need to show that B is of the form

\[ B = f^{(n)}(c + \theta h) \]

for some $\theta$ between 0 and 1.
Differentiating (11.2) repeatedly with respect to t we obtain

\[ g^{(j)}(t) = f^{(j)}(c + t) - \sum_{k=j}^{n-1} \frac{f^{(k)}(c)}{(k-j)!} t^{k-j} - \frac{B}{(n-j)!} t^{n-j}, \qquad (j = 0, 1, \ldots, n-1), \]

and therefore

\[
\begin{aligned}
g^{(j)}(0) &= 0, \qquad (j = 0, 1, \ldots, n-1), \\
g^{(n)}(t) &= f^{(n)}(c + t) - B.
\end{aligned}
\tag{11.3}
\]

We apply the mean value theorem (actually Rolle’s theorem) repeatedly and
exploit the first equation of (11.3). Recall that B was chosen so that g(h) = 0,
but that we also have g(0) = 0. Hence, there exists $h_1$ between 0 and h, such that
$g'(h_1) = 0$; then, if n ≥ 2 we have $g'(0) = 0$, and so there exists $h_2$ between 0 and
$h_1$, such that $g''(h_2) = 0$; then, if n ≥ 3 we have $g''(0) = 0$, and so there exists $h_3$
between 0 and $h_2$, such that $g'''(h_3) = 0$, and so on, terminating in $h_n$ between 0
and $h_{n-1}$, such that $g^{(n)}(h_n) = 0$. But then $h_n$ is between 0 and h and by the second
equation of (11.3) satisfies

\[ f^{(n)}(c + h_n) - B = 0. \]

Set $h_n = \theta h$. We have $0 < \theta < 1$ and $B = f^{(n)}(c + \theta h)$.


Second proof of Taylor’s theorem with Lagrange’s remainder For 0 ≤ t ≤ h or
h ≤ t ≤ 0 (depending on the sign of h), we set

\[ g(t) = f(c + t) - \sum_{k=0}^{n-1} \frac{f^{(k)}(c)}{k!} t^k. \]

Differentiating repeatedly we obtain

\[ g^{(j)}(t) = f^{(j)}(c + t) - \sum_{k=j}^{n-1} \frac{f^{(k)}(c)}{(k-j)!} t^{k-j}, \qquad (j = 0, 1, \ldots, n-1), \]

and therefore

\[
\begin{aligned}
g^{(j)}(0) &= 0, \qquad (j = 0, 1, \ldots, n-1), \\
g^{(n)}(t) &= f^{(n)}(c + t).
\end{aligned}
\tag{11.4}
\]

We apply Cauchy’s mean value theorem repeatedly to the quotient $g(t)/t^n$ and exploit
the first equation of (11.4). We obtain the string of equalities:

\[ \frac{g(h)}{h^n} = \frac{g'(h_1)}{n h_1^{n-1}} = \frac{g''(h_2)}{n(n-1) h_2^{n-2}} = \cdots = \frac{g^{(n)}(h_n)}{n!} \]

where $h_1$ lies between 0 and h, and in general $h_{j+1}$ lies between 0 and $h_j$. Putting
$h_n = \theta h$ we see that $0 < \theta < 1$ and the extreme members of the string of equalities
together with the second equation of (11.4) give

\[ f(c + h) - \sum_{k=0}^{n-1} \frac{f^{(k)}(c)}{k!} h^k = g(h) = \frac{f^{(n)}(c + \theta h)}{n!} h^n. \]


11.7.2 Error Estimates


When an approximation is used it can be useful to estimate the error. The main 
force of Taylor’s theorem resides in the information it gives about the remainder 
term. This can help us to estimate the error when a function is approximated by one 
of its Taylor polynomials in a part of its domain of definition. 

Let us look at some elementary functions, taking the point c to be 0, and write down
Lagrange’s remainder. For all x ≠ 0 and m, there exists, in each of the following
cases, $\xi$ between 0 and x (not necessarily the same for each case), such that

(a) \[ \cos x = \sum_{k=0}^{m} (-1)^k \frac{x^{2k}}{(2k)!} + (-1)^{m+1} \frac{x^{2m+2}}{(2m+2)!} \cos \xi \]

(b) \[ \sin x = \sum_{k=0}^{m} (-1)^k \frac{x^{2k+1}}{(2k+1)!} + (-1)^{m+1} \frac{x^{2m+3}}{(2m+3)!} \cos \xi \]

(c) \[ e^x = \sum_{k=0}^{m} \frac{x^k}{k!} + e^{\xi} \frac{x^{m+1}}{(m+1)!}. \]

The sums appearing here are in each case partial sums of the Maclaurin series of
the corresponding functions. The Maclaurin series for cos x has only even powers
of x, that is, the coefficients of all odd powers are 0. In case (a) we can think of
the series as extending up to the unwritten term $0 \cdot x^{2m+1}$, which is followed by the
remainder that includes the power $x^{2m+2}$. A similar remark holds for sin x, for which
the even powers are missing.

If the remainder tends to 0 as m → ∞ we get a proof that the function in question
is the sum of its Taylor series. The problem is that we know nothing about $\xi$ except
that it is confined between 0 and x. However we can make the following estimates:

\[ |\cos \xi| \le 1, \qquad 0 < e^{\xi} < \max(1, e^x). \]


Even without further knowledge of $\xi$ we can conclude in all three cases that the
error tends to 0 as m → ∞, by exploiting the very useful limit $\lim_{n \to \infty} x^n/n! = 0$,
itself a consequence of the convergence of the exponential series. Taylor’s theorem
yields nice new proofs of the power series representations of the circular functions
and the exponential function.
However, the error estimates may seem disappointing. The problem is that we
know nothing about that number $\xi$. For the circular functions we can only say that
the error is bounded by the next unused term. For $e^x$ the Lagrange remainder does
not look very useful for estimating the error when x > 0, as the only thing one can
say about $e^{\xi}$ is that it is less than $e^x$.

The truth is that these are very simple Maclaurin series for which it is easy to
estimate the tail directly, giving results very similar to estimating the remainder term.
But suppose one wants to approximate the function sin x on the interval $[\pi/6, \pi/4]$
using its fourth-degree Taylor polynomial centred at $\pi/6$? The remainder term here
is a valuable source of information (see the exercises).

A computer calculates transcendental functions by using polynomial approxi- 
mations tailored (pun intended!) to different parts of its domain of definition. An 
estimate of the error is essential so that we can be confident of delivering a minimum 
number of correct decimal digits. The remainder term offered by Taylor’s theorem 
was the first important tool for accomplishing this, although more sophisticated ones 
have since been developed. 
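
To make the role of the remainder concrete, here is a small Python sketch, offered as an illustration rather than as the method any actual library uses. It compares the true error of the degree-m Maclaurin polynomial of $e^x$ with the Lagrange bound $\max(1, e^x)\,|x|^{m+1}/(m+1)!$ discussed above. The function names are hypothetical, and `math.exp` is used only to supply reference values.

```python
import math

def exp_taylor(x, m):
    """Degree-m Maclaurin polynomial of e^x evaluated at x."""
    term, total = 1.0, 1.0
    for k in range(1, m + 1):
        term *= x / k          # term is now x^k / k!
        total += term
    return total

def exp_lagrange_bound(x, m):
    """Bound max(1, e^x) * |x|^(m+1) / (m+1)! on the Lagrange remainder."""
    return max(1.0, math.exp(x)) * abs(x) ** (m + 1) / math.factorial(m + 1)

for x in (-2.0, 0.5, 3.0):
    for m in (2, 5, 10, 20):
        err = abs(math.exp(x) - exp_taylor(x, m))
        print(x, m, err, exp_lagrange_bound(x, m))
```

In every row the actual error stays below the bound, and for x > 0 the bound is visibly pessimistic, as the text anticipates.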


11.7.3 Error Estimates for ln(1 + x)


We shall consider the Lagrange remainder for ln(1 + x), again taking c = 0. For
each x in the interval −1 < x < ∞ there exists $\xi$ between 0 and x, such that

\[ \ln(1 + x) = \sum_{n=1}^{m} \frac{(-1)^{n-1}}{n} x^n + \frac{(-1)^m}{(1 + \xi)^{m+1}} \, \frac{x^{m+1}}{m+1}. \]

Exercise Obtain this formula from Taylor’s theorem.


The first expression on the right, if extended indefinitely, is a power series with
radius of convergence 1. Nevertheless the above formula holds for all x and not just
those for which the power series converges. If x > 1 the series diverges and it is
impossible that the remainder should tend to 0 as m → ∞.

Analysis of the remainder term is a little tricky; certain difficulties will become
apparent. We have

\[ \left| \frac{(-1)^m}{(1 + \xi)^{m+1}} \, \frac{x^{m+1}}{m+1} \right| = \left| \frac{x}{1 + \xi} \right|^{m+1} \frac{1}{m+1}. \]

In order to prove that the remainder tends to 0 it seems that we must show that
$|x/(1 + \xi)| \le 1$, whilst knowing nothing about $\xi$ except that it lies between 0 and x.
To make further progress we must consider some special cases:

(a) If $0 < x \le 1$ then $0 < \xi$, $1 + \xi > 1$ and so

\[ \left| \frac{x}{1 + \xi} \right| < x \le 1. \]

The remainder term is bounded by 1/(m + 1) and so tends to 0.

(b) If $-\frac{1}{2} \le x < 0$ then $-\frac{1}{2} \le x < \xi < 0$, $1 + \xi > \frac{1}{2} \ge |x|$ and so

\[ \left| \frac{x}{1 + \xi} \right| < 1. \]
Again the remainder term tends to 0. In fact it is bounded by 2~”"-!/(m + 1) so 
we have reasonably fast convergence. 

(c) What if $-1 < x < -\frac{1}{2}$? There seems to be no way to ensure that $|x/(1 + \xi)| \le 1$,
which was the key to showing that the remainder tended to 0.


We seem to have a worse result on the representation of ln(1 + x) as a power series
than the one given previously and obtained by integrating the series for $(1 + x)^{-1}$.
There is though a small bonus. We obtain convergence at x = 1 and hence another
proof of the formula $\ln 2 = \sum_{n=1}^{\infty} (-1)^{n-1}/n$.
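
The bound 1/(m + 1) from cases (a) and (b) can be observed numerically. The sketch below is an illustration under the assumption that `math.log` supplies accurate reference values; the helper name `log_series_partial` is hypothetical.

```python
import math

def log_series_partial(x, m):
    """Partial sum sum_{n=1}^{m} (-1)^(n-1) x^n / n of the series for ln(1 + x)."""
    return sum((-1) ** (n - 1) * x ** n / n for n in range(1, m + 1))

for x in (1.0, -0.5):
    for m in (10, 100, 1000):
        err = abs(math.log(1.0 + x) - log_series_partial(x, m))
        # Lagrange's remainder guarantees err <= 1/(m+1) for -1/2 <= x <= 1, x != 0.
        print(x, m, err, 1.0 / (m + 1))
```

At x = 1 the error shrinks only about as fast as the bound, which is why the alternating series for ln 2 is a poor computational tool; at x = −1/2 the true error is far smaller than this crude bound.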


11.7.4 Error Estimates for $(1 + x)^a$


We consider the Lagrange remainder for $(1 + x)^a$ again taking c = 0. Since we wish
to admit any real power a, we must assume that x > −1. We have

\[ \frac{d^k}{dx^k} (1 + x)^a = a(a-1)\cdots(a-k+1)(1 + x)^{a-k} \]

and

\[ (1 + x)^a = \sum_{k=0}^{m} \frac{a(a-1)\cdots(a-k+1)}{k!} x^k + R_{m+1}(x). \]

The series, when extended indefinitely, has radius of convergence 1. Lagrange’s
remainder gives

\[ R_{m+1}(x) = \frac{a(a-1)\cdots(a-m)}{(m+1)!} (1 + \xi)^{a-m-1} x^{m+1} \qquad (\xi \text{ between 0 and } x) \]

and so

\[ |R_{m+1}(x)| = \left| \frac{a(a-1)\cdots(a-m)}{(m+1)!} x^{m+1} \right| (1 + \xi)^{a-m-1}. \]

Note that $1 + \xi > 0$ because x > −1.
The remainder is again tricky to analyse. In order to prove that the remainder
tends to 0 we could exploit the limit

\[ \lim_{m \to \infty} \frac{a(a-1)\cdots(a-m)}{(m+1)!} x^{m+1} = 0, \]

which is valid if |x| < 1 (a detail for the reader to check). We would therefore seek
to prove that the factor $(1 + \xi)^{a-m-1}$ is bounded as m → ∞. For this it is enough if
$\xi > 0$ for each m (remember that $\xi$ actually depends on m).

We therefore have two cases. For 0 < x < 1 it is obvious that $\xi > 0$ and the
remainder tends to 0. But for −1 < x < 0? There seems to be no way to ensure that
$(1 + \xi)^{a-m-1}$ is bounded.


11.7.5 Taylor’s Theorem with Cauchy’s Remainder 


The deficiencies of Lagrange’s form of the remainder are partially remedied by a 
different version: Cauchy’s remainder. The downside is that the formula is hard to 
remember, not to mention the proof. 


Proposition 11.23 Assume that $f : A \to \mathbb{R}$ has derivatives of all orders up to n, let
$c \in A$, $x \in A$. Then there exists $\xi$, in the interval ]c, x[ if x > c, and in the interval
]x, c[ if x < c, such that

\[ f(x) = \sum_{k=0}^{n-1} \frac{f^{(k)}(c)}{k!} (x - c)^k + \frac{f^{(n)}(\xi)}{(n-1)!} (x - \xi)^{n-1} (x - c). \]

Again we rewrite this by letting x − c = h and $\xi = c + \theta h$, where $0 < \theta < 1$.
Then we have the slightly easier to remember formula:

\[ f(c + h) = \sum_{k=0}^{n-1} \frac{f^{(k)}(c)}{k!} h^k + \frac{f^{(n)}(c + \theta h)}{(n-1)!} (1 - \theta)^{n-1} h^n. \]

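Proof of Taylor’s theorem with Cauchy’s remainder For all t, such that c + t is in
A, we set

\[ F(t) = f(c + h) - \sum_{k=0}^{n-1} \frac{(h - t)^k}{k!} f^{(k)}(c + t) \]

and differentiate (for this proof a single differentiation suffices), as follows:

\[
\begin{aligned}
F'(t) &= -\sum_{k=0}^{n-1} \frac{(h - t)^k}{k!} f^{(k+1)}(c + t) + \sum_{k=1}^{n-1} \frac{(h - t)^{k-1}}{(k-1)!} f^{(k)}(c + t) \\
&= -\sum_{k=0}^{n-1} \frac{(h - t)^k}{k!} f^{(k+1)}(c + t) + \sum_{k=0}^{n-2} \frac{(h - t)^k}{k!} f^{(k+1)}(c + t) \\
&= -\frac{(h - t)^{n-1}}{(n-1)!} f^{(n)}(c + t).
\end{aligned}
\]

We apply the mean value theorem. There exists $\theta$ in the interval ]0, 1[, such that
$F(0) - F(h) = -h F'(\theta h)$. Noting that F(h) = 0 we obtain

\[ f(c + h) - \sum_{k=0}^{n-1} \frac{h^k}{k!} f^{(k)}(c) = \frac{(1 - \theta)^{n-1} h^n}{(n-1)!} f^{(n)}(c + \theta h) \]

as was to be proved.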


Let us return to the problem of estimating the remainder for the function $\ln(1 + x)$.
We have

$$\ln(1 + x) = \sum_{n=1}^{m} \frac{(-1)^{n-1} x^n}{n} + R_{m+1}(x),$$

where Cauchy's remainder gives

$$R_{m+1}(x) = \frac{(-1)^m (x - \xi)^m x}{(1 + \xi)^{m+1}}$$

for some $\xi$ between 0 and $x$.

Previously we were unable to control the error when $-1 < x < -\frac{1}{2}$. Now suppose
that $-1 < x < 0$. Then $-1 < x < \xi < 0$ and the reader should check that this
implies that

$$0 < \frac{\xi - x}{1 + \xi} < |x|.$$

We therefore have

$$|R_{m+1}(x)| = \left|\frac{(x - \xi)^m x}{(1 + \xi)^{m+1}}\right| = \left(\frac{\xi - x}{1 + \xi}\right)^m \frac{|x|}{1 + \xi} < \frac{|x|^{m+1}}{1 + \xi} < \frac{|x|^{m+1}}{1 + x}$$

which tends to 0 as $m \to \infty$.
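
The estimate $|R_{m+1}(x)| < |x|^{m+1}/(1 + x)$ for $-1 < x < 0$ can be checked numerically. The following is only a sketch: the function names are illustrative, and the reference value is taken from the standard library function `math.log1p` rather than from anything in the text.

```python
import math


def log1p_partial_sum(x: float, m: int) -> float:
    """Sum of the first m terms of the Maclaurin series of ln(1 + x)."""
    return sum((-1) ** (n - 1) * x ** n / n for n in range(1, m + 1))


def cauchy_bound(x: float, m: int) -> float:
    """The bound |x|^(m+1) / (1 + x) on |R_{m+1}(x)|, meant for -1 < x < 0."""
    return abs(x) ** (m + 1) / (1 + x)


if __name__ == "__main__":
    x = -0.9  # a point where Lagrange's remainder gave no control
    for m in (10, 50, 200):
        error = abs(math.log1p(x) - log1p_partial_sum(x, m))
        print(m, error, cauchy_bound(x, m))
```

The printed errors should stay below the printed bounds, and both shrink geometrically as $m$ grows.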

Finally we return to the function $(1 + x)^a$ where $a$ is a real number. We recall
that we were unable to control Lagrange's remainder when $-1 < x < 0$. Cauchy's
remainder is

$$\begin{aligned}
R_{m+1}(x) &= \frac{a(a-1)\cdots(a-m)}{m!}\,(1 + \xi)^{a-m-1}(x - \xi)^m x \qquad (\text{for some } \xi \text{ between } 0 \text{ and } x) \\
&= \frac{a(a-1)\cdots(a-m)}{m!}\, x^{m+1} (1 + \xi)^{a-1} \left(\frac{x - \xi}{x(1 + \xi)}\right)^m .
\end{aligned}$$

We would like to exploit the known limit

$$\lim_{m\to\infty} \frac{a(a-1)\cdots(a-m)}{m!}\, x^{m+1} = 0,$$

a consequence of the convergence of the binomial series, but this requires gaining
control of the remaining factors.

Let $-1 < x < 0$. Then $-1 < x < \xi < 0$, and this gives

$$0 < 1 + x < 1 + \xi < 1$$

and

$$(1 + \xi)^{a-1} < \max\bigl(1, (1 + x)^{a-1}\bigr).$$

Moreover

$$0 < \frac{x - \xi}{x(1 + \xi)} < 1.$$

Again the reader should check these three claims, bearing in mind for the second
claim that $a - 1$ can be negative.

Now we have

$$|R_{m+1}(x)| \le \left|\frac{a(a-1)\cdots(a-m)}{m!}\, x^{m+1}\right| \max\bigl(1, (1 + x)^{a-1}\bigr),$$

and this tends to 0 as $m \to \infty$.
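
As a numerical illustration of this conclusion, one can sum the binomial series at a negative $x$ and compare with $(1 + x)^a$. This is a minimal sketch; the function name and the sample values of $a$ and $x$ are chosen here for illustration and do not come from the text.

```python
def binomial_partial_sum(a: float, x: float, m: int) -> float:
    """Sum over k = 0..m of a(a-1)...(a-k+1)/k! * x^k, built term by term."""
    term = 1.0   # the k = 0 term
    total = 1.0
    for k in range(1, m + 1):
        # ratio of the k-th term to the (k-1)-th term is (a - k + 1)/k * x
        term *= (a - (k - 1)) / k * x
        total += term
    return total


if __name__ == "__main__":
    a, x = 0.5, -0.9
    for m in (10, 100, 400):
        print(m, binomial_partial_sum(a, x, m), (1 + x) ** a)
```

For $-1 < x < 0$ the partial sums approach $(1 + x)^a$, slowly when $x$ is close to $-1$.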


11.7.6 Taylor’s Theorem with the Integral Form of the 
Remainder 


Lagrange's remainder and Cauchy's remainder can be maddening because we do not
know that crucial number $\xi$. It is therefore a relief to have a form of the remainder in
which everything is known. There is a small price; the premises are slightly stronger,
though not such that it matters much in practice.


Proposition 11.24 Assume that $f : A \to \mathbb{R}$ has continuous derivatives up to order
$n$, let $c \in A$, $x \in A$. Then

$$f(x) = \sum_{k=0}^{n-1} \frac{f^{(k)}(c)}{k!}(x - c)^k + \frac{1}{(n-1)!}\int_c^x (x - u)^{n-1} f^{(n)}(u)\,du.$$


Proof Write

$$R_n(x) = \frac{1}{(n-1)!}\int_c^x (x - u)^{n-1} f^{(n)}(u)\,du.$$

Integrating by parts we obtain a reduction formula

$$\begin{aligned}
R_n(x) &= \left[\frac{1}{(n-1)!} f^{(n-1)}(u)(x - u)^{n-1}\right]_{u=c}^{u=x} + \frac{1}{(n-2)!}\int_c^x (x - u)^{n-2} f^{(n-1)}(u)\,du \\
&= -\frac{1}{(n-1)!} f^{(n-1)}(c)(x - c)^{n-1} + R_{n-1}(x)
\end{aligned}$$

provided $n > 1$, and repeated applications lead to

$$\begin{aligned}
R_n(x) &= -\frac{1}{(n-1)!} f^{(n-1)}(c)(x - c)^{n-1} - \cdots - \frac{1}{2} f''(c)(x - c)^2 - f'(c)(x - c) + R_1(x) \\
&= -\sum_{k=1}^{n-1} \frac{f^{(k)}(c)}{k!}(x - c)^k + R_1(x).
\end{aligned}$$

But $R_1(x) = f(x) - f(c)$ and we are done.


As in previous versions of Taylor's theorem we rewrite the formula. Set $u = c + s$,
$x = c + h$, and write the integral with respect to $s$ instead of $u$. This gives

$$R_n = \frac{1}{(n-1)!}\int_0^h (h - s)^{n-1} f^{(n)}(c + s)\,ds.$$

Finally let $s = ht$ and write the integral with respect to $t$. We find

$$R_n = \frac{h^n}{(n-1)!}\int_0^1 (1 - t)^{n-1} f^{(n)}(c + th)\,dt.$$

This form of the remainder is often useful. For example, a variety of different
remainders, of which Lagrange's and Cauchy's are mere special cases, can be derived
from it. Let $1 \le p \le n$. We first write

$$R_n = \frac{h^n}{(n-1)!}\int_0^1 (1 - t)^{n-p} f^{(n)}(c + th)\,(1 - t)^{p-1}\,dt.$$

Using the mean value theorem for integrals (Proposition 6.15) we deduce that there
exists $\theta$ in the interval $]0, 1[$, such that

$$\begin{aligned}
R_n &= \frac{h^n (1 - \theta)^{n-p} f^{(n)}(c + \theta h)}{(n-1)!}\int_0^1 (1 - t)^{p-1}\,dt \\
&= \frac{h^n (1 - \theta)^{n-p} f^{(n)}(c + \theta h)}{p\,(n-1)!}.
\end{aligned}$$

This version of the remainder is known under several names, most frequently
Schlömilch's remainder or Roche's remainder. The case $p = n$ is Lagrange's remainder
(the denominator becomes $n!$ and the factor $(1 - \theta)^{n-p}$ disappears); the case
$p = 1$ is Cauchy's.


11.7.7 Exercises 


1. Calculate $\sqrt[3]{28}$ to four correct places of decimals by using an appropriate Taylor
polynomial of $\sqrt[3]{x}$ centred at $x = 27$.
2. It is proposed to approximate $\sqrt{x}$ in the interval $64 < x < 70$ by


(a) Its first-degree Taylor polynomial centred at $x = 64$.
(b) Its second-degree Taylor polynomial centred at $x = 64$.
(c) Its third-degree Taylor polynomial centred at $x = 64$.


In each case estimate the maximum error.

3. It is proposed to approximate $\sin x$ on the interval $\pi/6 < x < \pi/4$ by its fourth-degree
Taylor polynomial centred at $x = \pi/6$. Determine the polynomial and
estimate the maximum error.

4. Suppose we want to use the exponential series truncated after the term $x^m/m!$
to calculate $e^x$ for $0 < x < 1$. There are several ways to estimate the error, from
using a form of the remainder provided by Taylor's theorem, to estimating the
tail of the series directly.


(a) Obtain the upper bound $e\,x^{m+1}/(m + 1)!$ for the error using Lagrange's form
of the remainder.
(b) Estimate the tail directly. Show that

$$\sum_{k=m+1}^{\infty} \frac{x^k}{k!} < \frac{(m + 2)\,x^{m+1}}{(m + 1)\,(m + 1)!}.$$

Which estimate of the error is lower?
Note. An algorithm that calculates $e^x$ would only need to calculate it in the first place for
$0 < x < 1$. Values for other inputs can then be found by a further finite number of multiplications
or divisions by $e$.

5. The error function is defined for all real $x$ by

$$\operatorname{erf}(x) := \frac{2}{\sqrt{\pi}}\int_0^x e^{-t^2}\,dt.$$

Obtain the Maclaurin series of $\operatorname{erf}(x)$ and show that it converges to $\operatorname{erf}(x)$ for
all $x$.

Note. The error function is a non-elementary transcendental function. It is important in probability
and statistics. The factor $2/\sqrt{\pi}$ ensures that $\lim_{x\to\infty} \operatorname{erf}(x) = 1$. See Sect. 12.2 Exercise 2.


6. Show that if $f$ is a polynomial of degree $m$ then for every $c$ we have

$$f(x) = \sum_{k=0}^{m} \frac{f^{(k)}(c)}{k!}(x - c)^k.$$


7. In the exercises in Sect. 5.8 some so-called numerical approximations to derivatives
were studied. Using Taylor's theorem one can estimate the errors. In each
item we assume that $a$ is a point in the interval of definition $A$ of $f$, that $|h|$ is
sufficiently small that $a + h$ falls within $A$, and that $f$ has enough derivatives in
$A$ so that the statement makes sense. We begin with the usual difference quotient
so that a comparison can be made. The intermediate value property of derivatives
(Sect. 5.6 Exercise 8) can be useful.


(a) Show that for each $h \ne 0$ there exists $\xi$ in $A$, such that

$$\frac{f(a + h) - f(a)}{h} - f'(a) = \frac{1}{2} h f''(\xi).$$

(b) Show that for each $h \ne 0$ there exists $\xi$ in $A$, such that

$$\frac{f(a + h) - f(a - h)}{2h} - f'(a) = \frac{1}{6} h^2 f'''(\xi).$$

This suggests that $(f(a + h) - f(a - h))/2h$ could be a superior approximation
to $f'(a)$ than $(f(a + h) - f(a))/h$.
(c) Show that for each $h \ne 0$ there exists $\xi$ in $A$, such that

$$\frac{f(a + h) - 2f(a) + f(a - h)}{h^2} - f''(a) = \frac{1}{12} h^2 f^{(4)}(\xi).$$


8. A popular method to interpolate between known values of a function is to use
the chord joining the two points on the graph. Thus if we have points $(a, f(a))$
and $(b, f(b))$ on the graph $y = f(x)$ with $a < b$, the suggestion is to use

$$\frac{(b - x) f(a) + (x - a) f(b)}{b - a}$$

as an approximation to $f(x)$ at points $x$ between $a$ and $b$. This is known as the
method of proportional parts and has a long history.

Assuming that $f$ is twice differentiable, estimate the error by showing that for
each $x$ in the interval $]a, b[$, there exists $\xi$ between $a$ and $b$, such that

$$f(x) - \frac{(b - x) f(a) + (x - a) f(b)}{b - a} = -\frac{1}{2}(x - a)(b - x) f''(\xi).$$

Deduce that if $|f''(x)| \le M$ in the interval $[a, b]$, then

$$\left| f(x) - \frac{(b - x) f(a) + (x - a) f(b)}{b - a} \right| \le \frac{1}{8} M (b - a)^2.$$

Hint. Let $h = x - a$ and $k = b - x$. Expand $f(a + h)$ and $f(b - k)$ by Taylor's
theorem and eliminate the term involving $f'(x)$.


9. Let the function $f$ be defined on all of $\mathbb{R}$ and have derivatives up to order 2.
Suppose that $|f(x)| \le A$ and $|f''(x)| \le B$ for all $x$. Prove that

$$|f'(x)| \le 2\sqrt{AB}$$

for all $x$.
Hint. Use Taylor's theorem in the form

$$f(x + h) = f(x) + h f'(x) + \frac{h^2}{2} f''(x + \theta h)$$

and show that

$$|f'(x)| \le \frac{2}{h} A + \frac{h}{2} B$$

for all $h > 0$.


10. Prove the following theorem of S. N. Bernstein:


Let $f$ have derivatives of all orders in an open interval $A$. Let $c$ and $d$ be
points in $A$ such that $c < d$, and assume that $f^{(n)}(x) \ge 0$ for all $x$ in $[c, d]$
and for all $n$. Then the Taylor series of $f$ with centre $c$ converges to $f$ in
the interval $[c, d[$.


You might use the following steps to prove the theorem. Let $r = d - c$, and for
each $x$ in $[c, d[$ let $h = x - c$ and view the remainder at $x$ as a function $R_n(h)$
of $h$ for $0 \le h < r$.


(a) Show that

$$0 \le R_n(h) \le \frac{h^n}{r^n} R_n(r).$$

Hint. Use the integral form of the remainder.
(b) Show that $R_n(r) \le f(d)$.
(c) Deduce that $\lim_{n\to\infty} R_n(h) = 0$ for $0 \le h < r$.


11. Show that the Maclaurin series of $\tan x$ converges to $\tan x$ throughout the interval
$]-\pi/2, \pi/2[$ and deduce that its radius of convergence is $\pi/2$.


12. Let $g$ be an odd function on the interval $]-r, r[$, with derivatives up to the fifth
order. Show that for each $x$ in $]-r, r[$ one can write

$$g(x) = \frac{x}{3}\bigl(g'(x) + 2g'(0)\bigr) - \frac{x^5}{180}\, g^{(5)}(\xi)$$

for some $\xi$ between 0 and $x$.
Hint. Apply Cauchy's mean value theorem to the quotient

$$\frac{g(x) - \frac{x}{3} g'(x) - \frac{2x}{3} g'(0)}{x^5}$$

and remember that $g$ being odd, $g^{(n)}(0) = 0$ for all even $n$.


13. The argument used in the proof of Proposition 11.23 to obtain Cauchy's form of
the remainder can be adapted to obtain Schlömilch's, or Roche's form, under the
same conditions. Define the function $F(t)$ as in the proof of Proposition 11.23
and let $\phi(t) = (h - t)^p$ for some exponent $p \ge 1$. Apply Cauchy's mean value
theorem to the quotient

$$\frac{F(h) - F(0)}{\phi(h) - \phi(0)}$$

and obtain the remainder in the form

$$R_n = \frac{f^{(n)}(c + \theta h)\, h^n (1 - \theta)^{n-p}}{p\,(n-1)!}.$$

Note. This was previously obtained from the integral form of the remainder under slightly
stronger conditions (the nth derivative of $f$ was required to be continuous).


14. An example of a function whose Maclaurin series has radius of convergence 0.
Define the function

$$f(x) = \sum_{n=0}^{\infty} e^{-n} \cos(n^2 x), \qquad (x \in \mathbb{R}).$$

(a) Show that $f$ has derivatives of all orders.
(b) Show that

$$f^{(2k)}(0) = (-1)^k \sum_{n=0}^{\infty} e^{-n} n^{4k}$$

for all $k$.
(c) Estimate $|f^{(2k)}(0)|$ from below and deduce that the radius of convergence
of the Maclaurin series of $f$ is 0.


15. In this exercise you are asked to show that the Riemann zeta function $\zeta(s) =
\sum_{n=1}^{\infty} n^{-s}$ has derivatives of all orders in the domain $s > 1$. Moreover its derivative
can be obtained by differentiating the series term-by-term. We assume that
$s$ is real, but the result (as well as the proof as suggested below) holds equally
for complex $s$ with $\operatorname{Re} s > 1$ in the context of complex derivatives.

(a) Let $k$ be a positive integer. Show that the series $\sum_{n=1}^{\infty} n^{-s}(\ln n)^k$ is convergent
for every $s > 1$, and that for all $\delta > 0$ it is uniformly convergent with respect
to $s \ge 1 + \delta$.

(b) Deduce that $\zeta(s)$ has derivatives of all orders and

$$\zeta^{(k)}(s) = \sum_{n=1}^{\infty} n^{-s}(-\ln n)^k.$$
n=1 


11.8 () Bernoulli Numbers 


In this nugget we shall meet the Bernoulli numbers, a sequence of numbers that 
seem to crop up unexpectedly in all areas of pure and applied mathematics. Interest 
in them was such, that they were the subject of the first computer program (Ada 
Lovelace, 1843). The pointers to further study given at the end could easily read “all 
of mathematics”. 

The rule

$$\sum_{k=1}^{n} k^2 = \frac{1}{6} n(n + 1)(2n + 1)$$

is an easy consequence of the fact that if $g(x) = \frac{1}{6} x(x + 1)(2x + 1)$ then

$$x^2 = g(x) - g(x - 1).$$

To obtain a comparable rule for the sum $\sum_{k=1}^{n} k^3$, or more generally for a sum
$\sum_{k=1}^{n} f(k)$, where $f(x)$ is a polynomial, it would be enough to find another polynomial
$g(x)$, such that

$$f(x) = g(x) - g(x - 1).$$

Then we would have

$$\sum_{k=1}^{n} f(k) = g(n) - g(0).$$

Actually it is a little more convenient to seek $g(x)$, such that

$$f(x) = g(x + 1) - g(x). \qquad (11.5)$$

The conclusion is then

$$\sum_{k=1}^{n} f(k) = g(n + 1) - g(1).$$




The involvement of the Bernoulli numbers in the solution of the functional equation 
(11.5) is one of the surprises of mathematics. 
Since $g(x)$ is supposed to be a polynomial, we have, by Taylor's theorem

$$g(x + 1) = \sum_{k=0}^{\infty} \frac{1}{k!}\, g^{(k)}(x).$$

The sum is actually finite since all derivatives $g^{(k)}(x)$ are 0 for $k$ higher than the
degree of $g(x)$. Let $D$ denote the operator of differentiation with respect to $x$, that is,
for a given function $u(x)$ we now write $Du$ for $u'$. The advantage of this notation is
that multiple differentiations are written as a power of $D$ acting on $u$. Formally we
would like to write

$$g(x + 1) = \sum_{k=0}^{\infty} \frac{1}{k!} D^k g(x) = \left(\sum_{k=0}^{\infty} \frac{1}{k!} D^k\right) g(x) = e^D g(x).$$

The operator $e^D$ transforms $g(x)$ to $g(x + 1)$. But of course $e^D$ has no properly
defined meaning yet. It is a piece of useful nonsense such as has often been the
source of important discoveries in mathematics.

Let us continue calculating, using $e^D$ as if it were a well defined thing. Given the
polynomial $f(x)$ we seek a polynomial $g(x)$ that satisfies

$$e^D g(x) - g(x) = f(x).$$

Write this as

$$(e^D - 1)\, g(x) = f(x).$$

Shouldn't the solution be somehow

$$g(x) = \frac{1}{e^D - 1} f(x)?$$

We can even guess how to work this out. We continue without regard for rigour. We
write

$$\frac{1}{e^t - 1} = \frac{1}{t} \cdot \frac{t}{e^t - 1}.$$

Now $t/(e^t - 1)$ extends to a function on $\mathbb{R}$ that has derivatives of all orders; we only
have to define its value as 1 for $t = 0$. It has a Maclaurin series

$$\sum_{k=0}^{\infty} \frac{B_k}{k!}\, t^k$$

where the coefficients $B_k$ are known as the Bernoulli numbers. They are all rational,
as we shall see.

A possible interpretation of $\frac{1}{e^D - 1} f(x)$ might begin with

$$\frac{1}{e^D - 1} f(x) = \frac{1}{D}\left(\sum_{k=0}^{\infty} \frac{B_k}{k!} D^k\right) f(x) = \sum_{k=0}^{\infty} \frac{B_k}{k!} D^{k-1} f(x).$$

Let $m$ be the degree of the polynomial $f$. The infinite sum reduces to a finite one
and we have the formula

$$\frac{1}{e^D - 1} f(x) = B_0 \int f(x)\,dx + \sum_{k=1}^{m} \frac{B_k}{k!} D^{k-1} f(x),$$

where we have naturally interpreted $1/D$ as the instruction to find an antiderivative.

We proceed to justify this calculation. Basic to this is the following proposition, 
that enables us to calculate the Maclaurin series of 1/f from that of f by algebra 
alone. 


Proposition 11.25 Let $f$ be a function, with domain $]-r, r[$, and derivatives of all
orders. Assume that $f(x) \ne 0$ for all $x$ in $]-r, r[$. Then

(1) The reciprocal $1/f$ has derivatives of all orders in $]-r, r[$.
(2) Let $\sum_{n=0}^{\infty} a_n x^n$ be the Maclaurin series of $f$ and let $\sum_{n=0}^{\infty} b_n x^n$ be the Maclaurin
series of $1/f$. Then for each $m$ there is a polynomial $Q(x)$ (depending on $m$),
such that

$$\left(\sum_{n=0}^{m} a_n x^n\right)\left(\sum_{n=0}^{m} b_n x^n\right) = 1 + x^{m+1} Q(x).$$

Proof Conclusion 1 is a consequence of the formula $(1/f)' = -f'/f^2$. We can
clearly differentiate repeatedly.
As for conclusion 2, we know, by Proposition 5.14, that

$$\sum_{n=0}^{m} a_n x^n = f(x) + x^m g(x), \qquad \sum_{n=0}^{m} b_n x^n = \frac{1}{f(x)} + x^m h(x)$$

where $g$ and $h$ are continuous in $]-r, r[$ and $g(0) = h(0) = 0$. Therefore

$$\left(\sum_{n=0}^{m} a_n x^n\right)\left(\sum_{n=0}^{m} b_n x^n\right) = \bigl(f(x) + x^m g(x)\bigr)\left(\frac{1}{f(x)} + x^m h(x)\right) = 1 + x^m R(x)$$

where

$$R(x) = \frac{g(x)}{f(x)} + f(x) h(x) + x^m g(x) h(x).$$

The last two equations show that $R(x)$ is of the form $x^{-m} S(x)$, where $S(x)$ is a
polynomial, but also that $R(x)$ is continuous in $]-r, r[$. This can only hold if the
polynomial $S(x)$ is divisible by $x^m$, so that in fact $R(x)$ is itself a polynomial. More
than that, $R(0) = 0$, so that $R(x)$ is divisible by $x$. Then $R(x) = x Q(x)$ and $Q(x)$
is a polynomial, as required.


It is noteworthy that the proposition in no way requires convergence of the Maclau- 
rin series of f and 1/f; they may diverge, or even converge but to functions different 
from f and 1/f. 

Specialising to the case in hand, we have

$$\left(\sum_{k=0}^{m} \frac{x^k}{(k + 1)!}\right)\left(\sum_{k=0}^{m} \frac{B_k}{k!} x^k\right) = 1 + x^{m+1} Q(x),$$

where $Q(x)$ is a polynomial (which depends on $m$). All the functions in this equation
are now polynomials; so we can replace $x$ by the operator $D$ and assert the following
relation involving differential operators:

$$\left(\sum_{k=0}^{m} \frac{D^k}{(k + 1)!}\right)\left(\sum_{k=0}^{m} \frac{B_k}{k!} D^k\right) = 1 + D^{m+1} Q(D).$$

This we multiply by $D$ to yield

$$\left(\sum_{k=0}^{m} \frac{D^{k+1}}{(k + 1)!}\right)\left(\sum_{k=0}^{m} \frac{B_k}{k!} D^k\right) = D + D^{m+2} Q(D).$$

Let $f(x)$ now be a polynomial of degree $m$. We seek $g(x)$ that satisfies

$$g(x + 1) - g(x) = f(x).$$

Let $F(x)$ be an antiderivative for $f(x)$. Then $F(x)$ has degree $m + 1$ and so

$$\left(\sum_{k=0}^{m} \frac{D^{k+1}}{(k + 1)!}\right)\left(\sum_{k=0}^{m} \frac{B_k}{k!} D^k\right) F(x) = DF(x) + D^{m+2} Q(D) F(x) = f(x).$$

The solution to the problem $g(x + 1) - g(x) = f(x)$ is therefore

$$g(x) = \left(\sum_{k=0}^{m} \frac{B_k}{k!} D^k\right) F(x) = B_0 F(x) + \sum_{k=1}^{m} \frac{B_k}{k!} D^k F(x),$$

precisely the formula that we had guessed.
Now we can write down the desired sum formula, the original motivation for all
these calculations:

$$\begin{aligned}
\sum_{k=1}^{n} f(k) &= B_0 F(n + 1) + \sum_{k=1}^{m} \frac{B_k}{k!} D^k F(n + 1) - B_0 F(1) - \sum_{k=1}^{m} \frac{B_k}{k!} D^k F(1) \\
&= B_0 \int_1^{n+1} f + \sum_{k=1}^{m} \frac{B_k}{k!}\bigl(D^{k-1} f(n + 1) - D^{k-1} f(1)\bigr).
\end{aligned}$$

It is interesting to replace $n$ by $n - 1$ (on both sides of course), use the facts that
$B_0 = 1$ and $B_1 = -\frac{1}{2}$, and add $f(n)$ to both sides. The result can be written as

$$\sum_{k=1}^{n} f(k) = \int_1^{n} f + \frac{1}{2}\bigl(f(1) + f(n)\bigr) + \sum_{k=2}^{m} \frac{B_k}{k!}\bigl(D^{k-1} f(n) - D^{k-1} f(1)\bigr).$$

This should be compared to the result of Sect. 8.1 Exercise 2. It is again an instance
of the Euler–Maclaurin summation formula.


11.8.1 Computing the Bernoulli Numbers 


One can compute the Maclaurin series of $1/f$ from the Maclaurin series of $f$ by purely
algebraic means, without requiring convergence of the series (compare Sect. 11.4
Exercise 18). We start from the equation

$$\left(\sum_{k=0}^{m} \frac{x^k}{(k + 1)!}\right)\left(\sum_{k=0}^{m} \frac{B_k}{k!} x^k\right) = 1 + x^{m+1} Q(x).$$

Setting $x = 0$ we deduce that $B_0 = 1$. We can multiply the polynomial factors on
the left-hand side, obtain the coefficient of $x^n$ in the product for $n = 1, 2, \ldots, m$, and
equate it to 0. This gives

$$\sum_{k=0}^{n} \frac{B_k}{k!\,(n - k + 1)!} = 0$$

and there is clearly no cap on $n$ since we may raise $m$ as much as we like. It is
convenient to multiply by $(n + 1)!$ and use binomial coefficients. We obtain the
recurrence relations for the Bernoulli numbers

$$\sum_{k=0}^{n} \binom{n + 1}{k} B_k = 0, \qquad n = 1, 2, 3, \ldots$$

Some samples, after $B_0 = 1$, are the relations:

$$\begin{aligned}
B_0 + 2B_1 = 0 &\implies B_1 = -\tfrac{1}{2} \\
B_0 + 3B_1 + 3B_2 = 0 &\implies B_2 = \tfrac{1}{6} \\
B_0 + 4B_1 + 6B_2 + 4B_3 = 0 &\implies B_3 = 0 \\
B_0 + 5B_1 + 10B_2 + 10B_3 + 5B_4 = 0 &\implies B_4 = -\tfrac{1}{30} \\
B_0 + 6B_1 + 15B_2 + 20B_3 + 15B_4 + 6B_5 = 0 &\implies B_5 = 0 \\
B_0 + 7B_1 + 21B_2 + 35B_3 + 35B_4 + 21B_5 + 7B_6 = 0 &\implies B_6 = \tfrac{1}{42}
\end{aligned}$$

and so on.
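
The recurrence $\sum_{k=0}^{n} \binom{n+1}{k} B_k = 0$ lends itself to exact computation with rational arithmetic. The sketch below is illustrative only: the function names are invented here, and `power_sum` applies the summation formula of this section, $\sum_{k=1}^{n} f(k) = \int_1^n f + \frac{1}{2}(f(1) + f(n)) + \sum_{k=2}^{m} \frac{B_k}{k!}(D^{k-1} f(n) - D^{k-1} f(1))$, to the special case $f(x) = x^p$ with $p \ge 1$.

```python
from fractions import Fraction
from math import comb, factorial


def bernoulli_numbers(n_max: int) -> list[Fraction]:
    """Return [B_0, ..., B_{n_max}] using sum_{k=0}^{n} C(n+1, k) B_k = 0."""
    B = [Fraction(1)]  # B_0 = 1
    for n in range(1, n_max + 1):
        partial = sum(comb(n + 1, k) * B[k] for k in range(n))
        B.append(-partial / (n + 1))  # coefficient of B_n is C(n+1, n) = n + 1
    return B


def power_sum(p: int, n: int) -> Fraction:
    """Sum of k^p for k = 1..n via the summation formula with f(x) = x^p, p >= 1."""
    B = bernoulli_numbers(p)
    total = Fraction(n ** (p + 1) - 1, p + 1)   # integral of x^p from 1 to n
    total += Fraction(1 + n ** p, 2)            # (f(1) + f(n)) / 2
    for k in range(2, p + 1):
        # D^{k-1} x^p = p!/(p-k+1)! * x^(p-k+1)
        coeff = B[k] / factorial(k) * Fraction(factorial(p), factorial(p - k + 1))
        total += coeff * (n ** (p - k + 1) - 1)
    return total


if __name__ == "__main__":
    print(bernoulli_numbers(6))
    print(power_sum(2, 5), sum(k ** 2 for k in range(1, 6)))
```

Using `Fraction` keeps every Bernoulli number exact, so the comparison with a directly computed sum is an equality test rather than a floating-point approximation.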


11.8.2 Exercises


1. Prove that all the Bernoulli numbers are rational.

2. Prove that $B_n = 0$ if $n$ is an odd number higher than 1.

3. Obtain a formula for $\sum_{k=1}^{n} k^p$ as a polynomial $P(n)$ of degree $p + 1$.

4. Prove the following formulas, in each case giving a lower estimate for the number $r$
in the range limits (not necessarily the same in both formulas):

(a) $$\cot x = \frac{1}{x} + \sum_{k=1}^{\infty} \frac{(-1)^k 2^{2k} B_{2k}}{(2k)!}\, x^{2k-1}, \qquad (0 < |x| < r).$$

(b) $$\tan x = \sum_{k=1}^{\infty} \frac{(-1)^{k-1} 2^{2k} (2^{2k} - 1) B_{2k}}{(2k)!}\, x^{2k-1}, \qquad (-r < x < r).$$


11.8.3 Pointers to Further Study 


— Euler–Maclaurin summation
— Faulhaber's formula
— Number theory
— Combinatorics


11.9 (O) Asymptotic Orders of Magnitude


Consider the function $f(x) = x^5 + 2x^4 + 3x^2 + 4x + 5$. Suppose that we want to
give some idea of $f(x)$ when $x$ is large. Of course the question is rather vague. What
meaning should we attach to "some idea"? What meaning to "large"?

It all depends on context and purpose. It might be adequate to point out that
$\lim_{x\to\infty} f(x) = \infty$; $f(x)$ is simply really big when $x$ is really big. We may want to
give some idea of how big $f(x)$ is when $x$ is really big. Then we might point out
that $f(x)$ grows at about the same rate as $x^5$, vanishingly slowly compared to $x^6$, but
much faster than $x^4$.

To express such statements concisely, a notation has been devised and was popularised
largely by the mathematician Edmund Landau. In the above example we
had a function $f(x)$ and we wished to compare it to powers of $x$ for large $x$. More
generally let $h(x)$ be a function, positive on an interval $]K, \infty[$, which will be used
for comparison. We write

$$f(x) = O(h(x)), \qquad (x \to \infty)$$

to mean that $f(x)/h(x)$ is bounded on some interval $[L, \infty[$. We write

$$f(x) = o(h(x)), \qquad (x \to \infty)$$

to mean that $\lim_{x\to\infty} f(x)/h(x) = 0$.


The notation can be used for sequences. Let $(a_n)_{n=1}^{\infty}$ be a sequence and let $(h_n)_{n=1}^{\infty}$
be a positive sequence, intended for comparison. We write

$$a_n = O(h_n), \qquad (n \to \infty)$$

to mean that $a_n/h_n$ is bounded. We write

$$a_n = o(h_n), \qquad (n \to \infty)$$

to mean that $\lim_{n\to\infty} a_n/h_n = 0$.
Thus

$$x^5 + 2x^4 + 3x^2 + 4x + 5 = O(x^5), \qquad (x \to \infty)$$

$$x^5 + 2x^4 + 3x^2 + 4x + 5 = o(x^6), \qquad (x \to \infty).$$

It should be understood that these are statements about the functions named here; the
variable $x$ is in fact a bound variable. This can be seen by spelling out the meaning
in full, which requires the quantifier "for all $x > L$".

Now it has to be said that there is something a little strange about the use of an
equality sign here. The apparent equation expresses that $f$ has a certain property;
for example $f(x) = O(1)$ says that $f$ is bounded in some interval $[K, \infty[$. And we
certainly cannot write $O(1) = f(x)$ to express the same thing. Nevertheless, certain
properties of equality do obtain here; for example if $f(x) = O(h(x))$ then it may be
possible, if care is taken, to substitute $O(h)$ for $f$ in a relation that contains $f$. We
shall see examples of this.

The notation extends naturally to other destinations than $\infty$. For example, the
statements

$$f(x) = O(h(x)), \qquad (x \to a)$$

and

$$f(x) = o(h(x)), \qquad (x \to a)$$

mean, respectively, that there exists $\delta > 0$, such that $f(x)/h(x)$ is bounded for
$0 < |x - a| < \delta$, and $\lim_{x\to a} f(x)/h(x) = 0$.

What makes the notation useful is a certain flexibility. For example an expression 
O(h) can appear in algebraic combinations. Some examples follow with elucidations 
of the meaning where appropriate. The proofs are left as exercises. 


(a) $O(x^m) + O(x^n) = O(x^{\min(m,n)})$, $(x \to 0)$.
Meaning. If $f(x) = O(x^m)$ and $g(x) = O(x^n)$ then $f(x) + g(x) = O(x^{\min(m,n)})$.

Here $m$ and $n$ are integers, positive or negative. Real $m$ and $n$ could be admitted if $x$
approaches 0 from above.
Exercise Write the corresponding statement in the case $x \to \infty$.

(b) $x^m O(x^n) = O(x^{m+n})$, $(x \to 0)$.
Meaning. If $f(x) = O(x^n)$ then $x^m f(x) = O(x^{m+n})$.

(c) $\dfrac{1}{1 + x} = 1 - x + O(x^2)$, $(x \to 0)$.
Meaning. $\dfrac{1}{1 + x} - 1 + x = O(x^2)$.

(d) $\dfrac{1}{1 + x + O(x^2)} = 1 - x + O(x^2)$, $(x \to 0)$.
Meaning. If $f(x) = O(x^2)$ then $\dfrac{1}{1 + x + f(x)} - 1 + x = O(x^2)$.

Formula (d) can be thought of as the result of substituting $x + O(x^2)$ for $x$ in formula
(c), thus exploiting the true formula $x = x + O(x^2)$. It is particularly useful for
obtaining asymptotic information about quotients from a few basic Maclaurin series.
Here are some further examples. The derivations are left as exercises.

(e) $\dfrac{1}{x} - \dfrac{1}{\sin x} = -\dfrac{x}{6} + O(x^3)$, $(x \to 0)$.

One can obtain this by using $\sin x = x - (x^3/6) + O(x^5)$. We deduce from (e) that

$$\lim_{x\to 0}\left(\frac{1}{x^2} - \frac{1}{x \sin x}\right) = -\frac{1}{6}.$$

The calculation of this limit using L'Hôpital's rule is much longer.
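
A quick numerical sanity check of this limit is possible. It is a sketch under the usual floating-point caveats: the function name is illustrative, and $x$ should not be taken too small because the two terms nearly cancel.

```python
import math


def g(x: float) -> float:
    """The expression 1/x^2 - 1/(x sin x), whose limit at 0 is -1/6."""
    return 1.0 / x**2 - 1.0 / (x * math.sin(x))


if __name__ == "__main__":
    for x in (0.5, 0.1, 0.01):
        print(x, g(x), -1.0 / 6.0)
```

Dividing (e) by $x$ shows that the discrepancy $g(x) + \frac{1}{6}$ is $O(x^2)$, which is the rate at which the printed values approach $-\frac{1}{6}$.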


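(f) $\dfrac{1}{\sin x} = \dfrac{1}{x} + \dfrac{x}{6} + \dfrac{7x^3}{360} + O(x^5)$, $(x \to 0)$.

We deduce from this that

$$\lim_{x\to 0} \frac{1}{x^3}\left(\frac{1}{\sin x} - \frac{1}{x} - \frac{x}{6}\right) = \frac{7}{360}.$$

Obtaining this by L'Hôpital's rule is a long haul; compare the limits in Sect. 5.8
Exercise 2.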

nm —n +1 _ 2 
n+n34+1 n2 


(g) 


2 1 
+5+0(5). (n +> oo). 


11.9.1 Asymptotic Expansions 


We recall Proposition 5.14, sometimes called Peano's form of Taylor's theorem.
This may be expressed using Landau's notation. Suppose that $f$ has derivatives of
all orders at the point $a$. Then, for each $n$ we have

$$f(x) = \sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!}(x - a)^k + o((x - a)^n), \qquad (x \to a).$$

This is an example of an asymptotic expansion. Note that the convergence, or otherwise,
of the Taylor series

$$\sum_{k=0}^{\infty} \frac{f^{(k)}(a)}{k!}(x - a)^k$$

is irrelevant. It could even be divergent for all $x \ne a$.

The sequence of functions $(x - a)^n$, $(n = 1, 2, 3, \ldots)$, form what is called an
asymptotic scale as $x \to a$. More generally an asymptotic scale (as $x \to a$) is a
sequence of functions $h_n(x)$, $(n = 1, 2, 3, \ldots)$, such that for each $n$ we have

$$h_{n+1}(x) = o(h_n(x)), \qquad (x \to a).$$

An asymptotic expansion of a function $f$ relative to the asymptotic scale $(h_n)_{n=1}^{\infty}$ is
then of the form

$$f(x) = \sum_{k=1}^{n} c_k h_k(x) + o(h_n(x)), \qquad (x \to a)$$

where the coefficients $c_k$ are real numbers, which can be continued up to some index
$k = N$, or else indefinitely.

In spite of the ubiquity of Taylor series it seems unusual for them to be divergent 
in practical applications. The most common type of asymptotic expansion is one that 
gives successively better approximations as x — oo. 

We shall explore one simple example of this. For $x > 0$ we define

$$E(x) = \int_0^1 \frac{e^{-x/t}}{t}\,dt.$$

There is no problem with the integral at $t = 0$ since the integrand tends to 0.
Successive integration by parts leads to a reversed reduction formula. It is simplest
to describe it by letting

$$I_n = \frac{(-1)^n n!}{x^n}\int_0^1 t^{n-1} e^{-x/t}\,dt, \qquad n = 0, 1, 2, \ldots$$

Then integration by parts gives

$$I_n = \frac{(-1)^n n!\, e^{-x}}{x^{n+1}} + I_{n+1}.$$

Exercise Derive this formula.


Now we obtain, starting at $I_0$,

$$E(x) = e^{-x}\sum_{k=0}^{n} \frac{(-1)^k k!}{x^{k+1}} + I_{n+1}.$$

We shall see that this is an asymptotic expansion as $x \to \infty$ relative to the asymptotic
scale

$$h_n(x) = \frac{e^{-x}}{x^{n+1}}, \qquad (n = 1, 2, 3, \ldots).$$

Exercise Check that $h_{n+1}(x) = o(h_n(x))$, $(x \to \infty)$.


Next one has to check that $I_{n+1}(x) = o(h_n(x))$. This reduces to showing that

$$\lim_{x\to\infty} e^x \int_0^1 t^n e^{-x/t}\,dt = \lim_{x\to\infty} \int_0^1 t^n e^{-x\left(\frac{1}{t} - 1\right)}\,dt = 0.$$

Exercise Prove this. One way is to show that for each $\delta > 0$ the integral from 0 to
$1 - \delta$ tends to 0 as $x \to \infty$, by noting that the integrand converges to 0 as $x \to \infty$,
uniformly with respect to $t$ in the interval $[0, 1 - \delta]$. The remaining part of the
integral, from $1 - \delta$ to 1, is bounded by $\delta$.


In this example the series $\sum_{k=0}^{\infty} (-1)^k k!/x^{k+1}$ is divergent for all $x > 0$. Nevertheless
its terms initially decrease; in fact they do so as long as $k < x$. Looking ahead
to the next nugget, on Stirling's approximation to $k!$, we can see that with increasing
$k$, the term $k!/x^{k+1}$ reaches a minimum at around $k = x$ of around $\sqrt{2\pi/k}\, e^{-k}$. For
$k = 10$ this is about $3.6 \times 10^{-5}$ so we might expect the sum of ten terms to give a nice
approximation to $E(10)$. This is the case.
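
The claim about $E(10)$ can be tested numerically. The sketch below assumes the definition $E(x) = \int_0^1 e^{-x/t}\,t^{-1}\,dt$ and evaluates it by composite Simpson quadrature; the quadrature rule, the number of panels and the function names are choices made here for illustration, not part of the text.

```python
import math


def E(x: float, panels: int = 2000) -> float:
    """Composite Simpson approximation of the integral of exp(-x/t)/t over [0, 1].

    `panels` must be even. The integrand is given its limiting value 0 at t = 0.
    """
    def integrand(t: float) -> float:
        return 0.0 if t == 0.0 else math.exp(-x / t) / t

    h = 1.0 / panels
    total = integrand(0.0) + integrand(1.0)
    for i in range(1, panels):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0


def asymptotic_sum(x: float, n: int) -> float:
    """e^{-x} times the sum over k = 0..n of (-1)^k k! / x^(k+1)."""
    return math.exp(-x) * sum(
        (-1) ** k * math.factorial(k) / x ** (k + 1) for k in range(n + 1)
    )


if __name__ == "__main__":
    x = 10.0
    exact = E(x)
    approx = asymptotic_sum(x, 9)                       # ten terms, k = 0..9
    last_term = math.exp(-x) * math.factorial(9) / x ** 10
    print(exact, approx, abs(exact - approx), last_term)
```

With ten terms the discrepancy should come out smaller than the last term used, even though the full series diverges.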


Exercise Show that $|I_{n+1}| < e^{-x} n!/x^{n+1}$ and therefore the error is less than the last
term used of the expansion.


11.9.2 Pointers to Further Study


— Special functions 
— Asymptotic expansions 


11.10 (©) Stirling’s Approximation 


The factorial function $n!$ grows rapidly with increasing $n$, and it quickly becomes
impossible to compute it by multiplying together all integers from 1 to $n$. In the
simplest instance Stirling's approximation compares $n!$ to $n^n$, which is easier to
compute.

A good clue as to how to approximate $n!$ comes from writing

$$\ln n! = \sum_{k=2}^{n} \ln k.$$

This suggests that we could approximate $\ln n!$, or equivalently the sum $\sum_{k=2}^{n} \ln k$,
by comparing it to the integral $\int_1^n \ln x\,dx$, which is equal to $n \ln n - n + 1$. So we
should compare $n!$ to $n^n e^{-n+1}$. We have arrived very near to Stirling's approximation.


Proposition 11.26 (Stirling's approximation)

$$\lim_{n\to\infty} \frac{n!}{n^{n+\frac{1}{2}} e^{-n}} = \sqrt{2\pi}.$$

According to this, we can use $n^{n+\frac{1}{2}} e^{-n} \sqrt{2\pi}$ as an approximation to $n!$ when $n$ is big.
This is an example of an asymptotic approximation. For example $10! = 3628800$
and Stirling's approximation is 3598696. The error is within 1%. For larger $n$ we
can expect the percentage error to decrease, though the absolute error may
increase. A stronger result yields upper and lower bounds:

$$n^{n+\frac{1}{2}} e^{-n} \sqrt{2\pi} < n! < n^{n+\frac{1}{2}} e^{-n+\frac{1}{12n}} \sqrt{2\pi}.$$
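
The figures quoted for $10!$ are easy to reproduce. The sketch below assumes the classical two-sided bounds $n^{n+\frac{1}{2}} e^{-n}\sqrt{2\pi} < n! < n^{n+\frac{1}{2}} e^{-n+\frac{1}{12n}}\sqrt{2\pi}$; the function name is illustrative.

```python
import math


def stirling(n: int) -> float:
    """Stirling's approximation n^(n + 1/2) * e^(-n) * sqrt(2*pi) to n!."""
    return n ** (n + 0.5) * math.exp(-n) * math.sqrt(2 * math.pi)


if __name__ == "__main__":
    for n in (1, 5, 10, 20):
        exact = math.factorial(n)
        approx = stirling(n)
        upper = approx * math.exp(1 / (12 * n))
        print(n, exact, approx, upper, (exact - approx) / exact)
```

The last column is the relative error, which decreases with $n$ while the absolute difference grows.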


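The proof of Stirling's approximation is in several steps and builds strongly on
the strict concavity of the function $\ln x$. The reader might like to treat these steps as
exercises; so we give the proofs in a separate section.

Step 1. If $r$ is a natural number and $r \ge 2$ then

$$\int_{r-\frac{1}{2}}^{r+\frac{1}{2}} \ln x\,dx < \ln r, \qquad \frac{1}{2}\bigl(\ln(r - 1) + \ln r\bigr) < \int_{r-1}^{r} \ln x\,dx.$$

Step 2.

$$\int_{\frac{3}{2}}^{n} \ln x\,dx < \ln(n!) - \frac{1}{2}\ln n < \int_1^n \ln x\,dx.$$

Step 3. Let $u_n = \ln(n!) - \left(n + \frac{1}{2}\right)\ln n + n$. Then

$$\frac{3}{2}\left(1 - \ln\frac{3}{2}\right) < u_n < 1.$$

Step 4. $(u_n)_{n=1}^{\infty}$ is a decreasing sequence.
Step 5. The limit

$$\lim_{n\to\infty} \frac{n!}{n^{n+\frac{1}{2}} e^{-n}} = A$$

exists and

$$\left(\frac{2e}{3}\right)^{\frac{3}{2}} \le A \le e.$$

Step 6. $A = \sqrt{2\pi}$.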


11.10.1 Proofs


Proof of step 1 For the first inequality we use the strict concavity of the function
$\ln x$. Its graph lies beneath its tangent at the point $(r, \ln r)$. The slope of the tangent
is $1/r$. Hence we find

$$\ln x < \ln r + \frac{x - r}{r}, \qquad (x \ne r).$$

Integrate from $r - \frac{1}{2}$ to $r + \frac{1}{2}$. We find

$$\int_{r-\frac{1}{2}}^{r+\frac{1}{2}} \ln x\,dx < \ln r.$$

For the second inequality we again use the strict concavity of $\ln x$. The chord
joining $(r - 1, \ln(r - 1))$ and $(r, \ln r)$ lies beneath the graph. That is,

$$(x - r + 1)\ln r + (r - x)\ln(r - 1) < \ln x, \qquad (r - 1 < x < r).$$

Integrate from $r - 1$ to $r$. We find

$$\frac{1}{2}\ln r + \frac{1}{2}\ln(r - 1) < \int_{r-1}^{r} \ln x\,dx.$$


Proof of step 2 The first inequality from step 1, together with the fact that $\ln x$ is
increasing, gives

$$\int_{\frac{3}{2}}^{n} \ln x\,dx = \sum_{r=2}^{n}\int_{r-\frac{1}{2}}^{r+\frac{1}{2}} \ln x\,dx - \int_{n}^{n+\frac{1}{2}} \ln x\,dx < \sum_{r=2}^{n} \ln r - \frac{1}{2}\ln n$$

leading to

$$\int_{\frac{3}{2}}^{n} \ln x\,dx < \ln(n!) - \frac{1}{2}\ln n.$$

Sum the second inequality of step 1 from $r = 2$ to $r = n$. We find

$$\frac{1}{2}\ln((n - 1)!) + \frac{1}{2}\ln(n!) < \int_1^n \ln x\,dx$$

or

$$\ln(n!) - \frac{1}{2}\ln n < \int_1^n \ln x\,dx.$$


Proof of step 3 Compute the integrals in step 2. We find

$$n\ln n - n - \frac{3}{2}\ln\frac{3}{2} + \frac{3}{2} < \ln(n!) - \frac{1}{2}\ln n < n\ln n - n + 1$$

which implies the sought-after inequality.


Proof of step 4 It is easily seen that

$$u_n - u_{n-1} = 1 - \left(n - \frac{1}{2}\right)\bigl(\ln n - \ln(n - 1)\bigr).$$

Compute the integral in the second inequality in step 1. We obtain (with $n$ in place
of $r$):

$$\frac{1}{2}\bigl(\ln(n - 1) + \ln n\bigr) < n\ln n - n - (n - 1)\ln(n - 1) + n - 1$$

or

$$1 < \left(n - \frac{1}{2}\right)\bigl(\ln n - \ln(n - 1)\bigr)$$

so that $u_n < u_{n-1}$.


Proof of step 5 That the limit exists follows since $u_n$ is decreasing, and it satisfies
the estimates of step 3. The lower estimate for $A$ is about 2.43.


Proof of step 6 Let

$$a_n = \frac{n!}{n^{n+\frac{1}{2}} e^{-n}}.$$

Then we have

$$\frac{a_{2n}}{a_n^2} = \frac{(2n)!}{(2n)^{2n+\frac{1}{2}} e^{-2n}} \cdot \frac{n^{2n+1} e^{-2n}}{(n!)^2} = \frac{(2n)!\,\sqrt{n}}{(n!)^2\, 2^{2n} \sqrt{2}}.$$

Wallis's product for $\pi$ (see the exercises below) is

$$\sqrt{\pi} = \lim_{n\to\infty} \frac{(n!)^2\, 2^{2n}}{(2n)!\,\sqrt{n}}.$$

We conclude that

$$\frac{1}{A} = \lim_{n\to\infty} \frac{a_{2n}}{a_n^2} = \frac{1}{\sqrt{2}\,\sqrt{\pi}}$$

so that $A = \sqrt{2\pi}$.

11.10.2 Exercises 


1. Prove Wallis's product for $\pi$. You can do this in the following steps. First set
$I_n = \int_0^{\pi/2} \sin^n x\,dx$.


(a) Prove that $I_{2m-1} > I_{2m} > I_{2m+1}$.
(b) Show that

$$\lim_{m\to\infty} \frac{I_{2m+1}}{I_{2m-1}} = 1.$$

Hint. Use Wallis's integrals (see Sect. 8.2.6).
(c) Show that

$$\lim_{m\to\infty} \frac{I_{2m}}{I_{2m+1}} = 1.$$

(d) Deduce Wallis's product by applying Wallis's integrals.
2. With the aid of Stirling's approximation, determine whether the series

$$\sum_{n=1}^{\infty} \frac{(-1)^n (2n)!}{2^{2n} (n!)^2}$$

is absolutely convergent, conditionally convergent or divergent.


11.10.3 Pointers to Further Study 


— Gamma function 
— Computational mathematics 


Chapter 12
Improper Integrals


Hardy in his thirties held the view that the late years of a 
mathematician’s life were spent most profitably in writing books 


J. E. Littlewood 


12.1 Unbounded Domains and Unbounded Integrands 


The definition of the Riemann—Darboux integral assumes that the function is bounded 
and its domain is a bounded and closed interval. It is desirable to have an integration 
theory that can be applied directly if the function is unbounded, or the domain is 
unbounded, or both. This is accomplished with the Lebesgue integral and deserves 
properly a book of its own. 

It is however possible to enlarge the scope of the Riemann—Darboux integral 
by introducing improper integrals. These are defined as limits of proper (ordinary) 
integrals. There is really nothing improper about improper integrals. The name simply 
reflects the fact they are not defined by the normal method laid down for the Riemann— 
Darboux integral, but by one that builds on it through an additional limiting procedure. 

As examples consider the two integrals: 


$$\int_0^1 \frac{1}{\sqrt{x}}\,dx, \qquad \int_0^{\infty} e^{-x}\,dx.$$


These are improper integrals and are defined as limits of normal integrals. Riemann— 
Darboux integration is not immediately applicable; in the first integral the integrand 
is unbounded so we cannot form upper or lower sums; in the second the interval 
is unbounded so we cannot partition it into finitely many bounded intervals. These 
typify the two primary cases, the only ones we shall consider here. 
Case A. For every $\varepsilon > 0$ the function $f$ is bounded and integrable on the interval
$[a + \varepsilon, b]$, but unbounded on $[a, b]$. The improper integral from $a$ to $b$ is defined as
the limit
$$\int_a^b f = \lim_{\varepsilon\to 0+} \int_{a+\varepsilon}^b f$$
if the limit exists. The endpoint $b$ can be handled in a similar way if $f$ is integrable
on $[a, b - \varepsilon]$ for each $\varepsilon$.

Cauchy’s principle gives a necessary and sufficient condition for the limit to exist
and be finite: for all $\varepsilon > 0$ there exists $\delta > 0$, such that $\left|\int_x^y f\right| < \varepsilon$ for all $x$ and $y$
that satisfy $a < x < y < a + \delta$.

Case B. For every $L > a$ the function $f$ is bounded and integrable on the interval
$[a, L]$. The improper integral from $a$ to $\infty$ is defined as the limit

$$\int_a^\infty f = \lim_{L\to\infty} \int_a^L f$$

if the limit exists. Integrals with lower limit $-\infty$ are handled in a similar way.

Again we have by Cauchy’s principle a necessary and sufficient condition for the
limit to exist and be finite: for all $\varepsilon > 0$ there exists $K > 0$, such that $\left|\int_x^y f\right| < \varepsilon$ for
all $x$ and $y$ that satisfy $K < x < y$.

In each case, if the limit exists and is a finite number, we say that the improper 
integral is convergent. If the limit is infinite, or does not exist, we say that the integral 
is divergent. 
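
The two limiting procedures can be imitated numerically for the two examples given at the start of the section. This is a minimal sketch only; the midpoint rule and the names `midpoint_integral`, `case_a` and `case_b` are our own choices for illustration, not part of the theory.

```python
import math

def midpoint_integral(f, a, b, n=100_000):
    """Approximate the proper integral of f over [a, b] by the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def case_a(eps):
    """Proper integral of 1/sqrt(x) over [eps, 1]; the improper integral is its limit as eps -> 0+."""
    return midpoint_integral(lambda x: 1.0 / math.sqrt(x), eps, 1.0)

def case_b(L):
    """Proper integral of exp(-x) over [0, L]; the improper integral is its limit as L -> infinity."""
    return midpoint_integral(lambda x: math.exp(-x), 0.0, L)

if __name__ == "__main__":
    for eps in (1e-1, 1e-2, 1e-3, 1e-4):
        print("eps =", eps, "integral =", case_a(eps))
    for L in (1.0, 5.0, 10.0, 30.0):
        print("L =", L, "integral =", case_b(L))
```

The exact values of the proper integrals are $2 - 2\sqrt{\varepsilon}$ and $1 - e^{-L}$, so the printed numbers should drift towards 2 and 1 respectively.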

An integral such as $\int_{-\infty}^{\infty} f$ is improper at both ends. We say that it is convergent
if it is convergent at each end separately. This means that each of the integrals

$$\int_0^\infty f \quad\text{and}\quad \int_{-\infty}^0 f$$
is convergent.


Another example of an integral improper at both ends is

$$\int_0^1 x^{-1/2}(1-x)^{-1/2}\,dx.$$

This is considered convergent if it is convergent at both ends; for example, if both
the integrals

$$\int_0^{1/2} x^{-1/2}(1-x)^{-1/2}\,dx \quad\text{and}\quad \int_{1/2}^1 x^{-1/2}(1-x)^{-1/2}\,dx$$

are convergent (the choice of where to split the interval clearly does not matter).
12.1.1 Key Examples of Improper Integrals


We compile a list of improper integrals that can be used as yardsticks for studying 
the convergence or divergence of a large number of cases. We assume that p is a real 
number. 


(1) $\int_1^\infty \frac{1}{x^p}\,dx$ is convergent (at $\infty$) if and only if $p > 1$.

(2) $\int_0^1 \frac{1}{x^p}\,dx$ is convergent (at 0) if and only if $p < 1$.

(3) $\int_0^\infty \frac{1}{x}\,dx$ is divergent at both ends.


To check these claims we note that if $p \neq 1$ we have

$$\int_1^R \frac{1}{x^p}\,dx = \frac{R^{1-p}}{1-p} - \frac{1}{1-p}$$

and the limit as $R \to \infty$ is a finite number if and only if $p > 1$, whereas the limit as
$R \to 0+$ is a finite number if and only if $p < 1$.
The case $p = 1$ is exceptional; for

$$\int_1^R \frac{1}{x}\,dx = \ln R$$

and neither the limit as $R \to \infty$ nor that as $R \to 0+$ is finite.

Exercise Calculate the integrals in items 1 and 2 in the cases when they are convergent.


We continue the list with

(4) $\int_1^\infty x^a e^{-bx}\,dx$.

The integral is convergent (at $\infty$) in the following two sets of cases: if $b > 0$ with
no condition on $a$; or, if $b = 0$ and $a < -1$. In all other cases it is divergent.

(5) $\int_e^\infty (\ln t)^a\, t^{-b-1}\,dt$.


The integral is convergent under precisely the same conditions as the integral in 
item 4. 

For item 5 we can set $x = \ln t$ and reduce it to item 4, which we now consider
in detail. We shall apply a comparison test for improper integrals, analogous to the
comparison test for series. The details of this are in the next section, but for now we
proceed intuitively.

Firstly, if $b > 0$ we let $0 < c < b$. Then $x^a e^{-(b-c)x}$ tends to 0 as $x \to \infty$. There
exists $K > 0$, such that $x^a e^{-(b-c)x} < 1$ for all $x > K$. Hence
$$x^a e^{-bx} < e^{-cx}$$
for all $x > K$. The integral

$$\int_1^\infty e^{-cx}\,dx$$

is convergent by calculation. Now we can apply the comparison test for improper
integrals (see the next section) and conclude that the integral in item 4 is convergent.

Secondly, if $b = 0$ the integral reduces to item 1. Thirdly, if $b < 0$ we choose $c$,
such that $0 < c < -b$. Then $\lim_{x\to\infty} x^{-a} e^{(b+c)x} = 0$ and there exists $K > 0$, such
that

$$x^a e^{-bx} > e^{cx} \quad\text{for all } x > K.$$

The integral

$$\int_1^\infty e^{cx}\,dx$$

diverges by calculation. Again we can use the comparison test, referring the reader
to the next section, and conclude that the integral in item 4 is now divergent.

Exercise Calculate the integrals in items 4 and 5 in the cases when b > 0 and a is 
a non-negative integer. 


12.1.2 The Comparison Test for Improper Integrals 


There is a certain similarity between improper integrals and infinite series. This is 
exhibited first of all in the use of convergence tests for integrals, just as in the case 
of series. We even have a comparison test, already used in the previous section. Just 
as the comparison test for series is only applicable to positive series, the comparison 
test for integrals is only applicable to positive integrands. 

We shall look at two cases; other cases are similar and the reader should provide 
the details. In all cases we assume that f(x) > 0 for all x in the domain. The proofs, 
as in the case of positive series, are simple applications of the fact that an increasing 
function bounded above must converge to a finite limit. 


Case A. The integral
$$\int_a^b f$$
where $f(x)$ is unbounded as $x \to a+$, but integrable on $[a + \varepsilon, b]$ for each $\varepsilon > 0$.

The comparison test for case A assumes that we have another function $g$ on $[a, b]$,
that is bounded, positive and integrable on $[a + \varepsilon, b]$ for each $\varepsilon > 0$. As in the case
of positive series there are two modes of comparison: ordering of functions, and limit
comparison. The conclusions are as follows:

(1) If $f(x) \le g(x)$ in $]a, b]$ and $\int_a^b g$ is convergent (at $a$), then $\int_a^b f$ is also convergent.

(2) If $g(x) \le f(x)$ in $]a, b]$ and $\int_a^b g$ is divergent (at $a$), then $\int_a^b f$ is also divergent.

(3) If $\lim_{x\to a+} f(x)/g(x)$ exists and is neither 0 nor $\infty$, then either both integrals
are convergent or both integrals are divergent.


Case B. The integral
$$\int_a^\infty f$$
where $f$ is integrable on $[a, L]$ for each $L > a$.

The comparison test for case B assumes that we have a function $g$ on $[a, \infty[$,
positive and integrable on $[a, L]$ for each $L > a$. There are three conclusions, as
follows:

(1) If $f(x) \le g(x)$ in $[a, \infty[$ and $\int_a^\infty g$ is convergent (at $\infty$), then $\int_a^\infty f$ is also
convergent.

(2) If $g(x) \le f(x)$ in $[a, \infty[$ and $\int_a^\infty g$ is divergent, then $\int_a^\infty f$ is also divergent.

(3) If $\lim_{x\to\infty} f(x)/g(x)$ exists and is neither 0 nor $\infty$, then either both integrals
are convergent or both integrals are divergent.


Limit comparison, that is, the third conclusion in both cases, is very useful as the 
limit can often be found after guessing a suitable comparison function g. The key 
examples given earlier provide a first catalogue of prospective comparison functions. 
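
As a small illustration of limit comparison (our own example, not one from the text), take $f(x) = 1/(x^2 + x)$ on $[1, \infty[$ and guess the comparison function $g(x) = 1/x^2$ from item 1 of the key examples. The sketch below checks that $f/g$ tends to 1 and, because $f$ happens to have the elementary primitive $\ln(x/(x+1))$, also shows the partial integrals settling down to $\ln 2$.

```python
import math

def f(x):
    return 1.0 / (x * x + x)

def g(x):
    # Comparison function: the integral of 1/x**2 over [1, oo[ is convergent (p = 2 > 1).
    return 1.0 / (x * x)

def partial_integral_f(L):
    """Exact integral of f over [1, L], using the primitive ln(x / (x + 1))."""
    return math.log(L / (L + 1.0)) + math.log(2.0)

if __name__ == "__main__":
    for x in (1e1, 1e3, 1e6):
        print("x =", x, "f/g =", f(x) / g(x), "integral over [1, x] =", partial_integral_f(x))
```

In practice one rarely has a primitive for $f$; the point of the test is that the limit of $f/g$ alone settles the question.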


12.1.3 Exercises 


(1) Test the following improper integrals for convergence. In the first two a compar- 
ison function is suggested. After that you are on your own. 


dx ; comparison function 1/./x. 


1 
1 
”) ess 
| 
b ieee 
(®) / xJ/1+ x2 
1 /x+i1 
(c) / Neate 
1 


1 
ic / x2(1 + x?) ae 


dx ; comparison function 1/x?. 


x+1 
(e) el 
: 1 
———\——. dx (improper at both ends 
© ae Gmprop 
2 Je 1
(g) es 


2 
ee ae 
(h) / eat 
1 


(i) $\int_0^1 x^p (1 - x)^q\,dx$ (depends on $p$ and $q$; improper at both ends for some
values).

(j) $\int_0^\infty e^{-x} x^{p-1}\,dx$ (depends on $p$; improper at 0 for some values).

(k) $\int_{-\infty}^{\infty} e^{-t^2}\,dt$.

Note. The integral has the value $\sqrt{\pi}$, see Sect. 12.2 Exercise 2. It is immensely important
in probability and statistics. See also Sect. 11.7 Exercise 5 and also this section,
Exercise 6.

(l) $\int_0^\infty e^{100x - x^2}\,dx$.


2. Determine for which values of $a$ and $b$ the integral

$$\int_0^1 x^a |\ln x|^b\,dx$$

is convergent at 0.

3. Show that if the integral $\int_a^\infty f$ is convergent at $\infty$ then

$$\lim_{x\to\infty} \int_x^\infty f = 0.$$

4. Give an example of a positive function $f$, such that the integral $\int_a^\infty f$ is convergent
at $\infty$ but $f(x)$ does not tend to 0 as $x \to \infty$.


5. (a) Let $f$ be a function, differentiable in an interval $A$ that contains $-1$ and 1.
Show that the limit

$$\lim_{\varepsilon\to 0+} \left( \int_{-1}^{-\varepsilon} \frac{f(x)}{x}\,dx + \int_{\varepsilon}^{1} \frac{f(x)}{x}\,dx \right)$$

exists and is a finite number.

(b) Let $g$ be a function, twice differentiable in an interval $A$ that contains $-1$
and 1. Suppose that $g$ has a simple zero at 0 (that is, $g(0) = 0$ but $g'(0) \neq 0$)
and no other zero in the interval $[-1, 1]$. Show that the limit

$$\lim_{\varepsilon\to 0+} \left( \int_{-1}^{-\varepsilon} \frac{1}{g(x)}\,dx + \int_{\varepsilon}^{1} \frac{1}{g(x)}\,dx \right)$$

exists and is a finite number.
Note. The results of these calculations are called the Cauchy principal values of the improper
integrals $\int_{-1}^{1} (f(x)/x)\,dx$ and $\int_{-1}^{1} (1/g(x))\,dx$, respectively.
6. For $n = 0, 1, 2, \ldots$, we define the function

$$J_n(x) = \int_x^\infty t^{-2n} e^{-t^2}\,dt.$$

(a) Show that for all integers $n \ge 1$ we have

$$J_0(x) = \sum_{k=0}^{n-1} \frac{(-1)^k (2k)!\, e^{-x^2}}{2^{2k+1}\, k!\, x^{2k+1}} + \frac{(-1)^n (2n-1)!}{2^{2n-1} (n-1)!}\, J_n(x).$$

Hint. Derive a reduction formula for $J_n(x)$ and then use induction.
(b) Show that the series

$$\sum_{k=0}^{\infty} \frac{(-1)^k (2k)!\, e^{-x^2}}{2^{2k+1}\, k!\, x^{2k+1}}$$

is divergent for all $x > 0$.
(c) (◊) Show that the series is an asymptotic expansion for $J_0(x)$ as $x \to \infty$,
with respect to the asymptotic scale

$$h_k(x) = \frac{e^{-x^2}}{x^{2k+1}}, \quad (k = 0, 1, 2, \ldots).$$

Hint. The definition of asymptotic expansion was given in Sect. 11.9. To
verify the details L’Hopital’s rule can be helpful.
(d) Show that the remainder term in item (a) is numerically less than the last
term used in the series.
Note. The function $\operatorname{erfc}(x) := 2J_0(x)/\sqrt{\pi}$ is called the complementary error function. It is
related to the normal error function by $\operatorname{erfc}(x) = 1 - \operatorname{erf}(x)$. According to Sect. 11.7 Exercise
5 the Maclaurin series of $\operatorname{erf}(x)$ converges to $\operatorname{erf}(x)$ for all $x$ and so the same is true of $\operatorname{erfc}(x)$.
For large $x$ the convergence is too slow to give a practical method to calculate $\operatorname{erfc}(x)$ and then
the asymptotic series can be helpful, as its terms decrease in magnitude up to around $k = x^2$,
and for only moderate $x$ the smallest term would seem to be very small. Most significantly
for practical calculation, the remainder or error term is numerically bounded by the last term
used of the series.
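
The claim about the error can be observed numerically. The sketch below assumes that the terms of the series are $(-1)^k (2k)!\, e^{-x^2} / (2^{2k+1} k!\, x^{2k+1})$ and that $J_0(x) = \tfrac{1}{2}\sqrt{\pi}\,\operatorname{erfc}(x)$; it relies on the standard library function `math.erfc` for the reference value, and the helper names are ours.

```python
import math

def asymptotic_terms(x, n):
    """First n terms (-1)^k (2k)! e^{-x^2} / (2^{2k+1} k! x^{2k+1}), k = 0, ..., n-1."""
    terms = []
    term = math.exp(-x * x) / (2.0 * x)            # the term with k = 0
    for k in range(n):
        terms.append(term)
        term *= -(2 * k + 1) / (2.0 * x * x)       # ratio of term k+1 to term k
    return terms

def j0_reference(x):
    """J_0(x), the integral of exp(-t^2) from x to infinity, via the library erfc."""
    return 0.5 * math.sqrt(math.pi) * math.erfc(x)

if __name__ == "__main__":
    x = 3.0
    terms = asymptotic_terms(x, 12)
    for n in range(1, len(terms) + 1):
        error = abs(j0_reference(x) - sum(terms[:n]))
        print(n, "error:", error, "last term used:", abs(terms[n - 1]))
```

For $x = 3$ the terms shrink until $k$ is about 9 and then start to grow, so the best attainable accuracy is limited; this is the typical behaviour of a divergent asymptotic series.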


12.2 Differentiation Under the Integral Sign 


In this section we take up a topic that is somewhat overdue, and which in the first 
instance is concerned only with proper integrals. In doing so we shall need to discuss 
functions of two variables. Actually we only need the notion of partial derivative, in 
its simplest manifestation. Given a function of two variables $f(x, y)$ we shall denote
by
$$D_1 f(x, y)$$
the result of differentiating with respect to the first variable of $f$, whilst holding
the second constant. We shall also need to differentiate twice; this is written as
$D_1^2 f(x, y)$ (that is, $D_1 D_1 f(x, y)$). We will never need $D_2 f(x, y)$, but we add that
it is the result of differentiating with respect to the second variable whilst holding 
the first constant. This will be our furthest excursion into the analysis of functions 
of two variables. 

Suppose we can define a function by the formula

$$F(x) = \int_a^b G(x, y)\,dy.$$

What we are concerned with here is the validity (or otherwise) of the formula

$$F'(x) = \int_a^b D_1 G(x, y)\,dy.$$


It is possible to derive some general results on this, using properties of continuous
functions of two variables, which are out of bounds here. However, in practice, most problems
of this kind are treated in an ad hoc manner, typically using Taylor’s theorem to
estimate a remainder. We shall develop some useful approaches through a sequence
of exercises. The result of Exercise 1 is quite adequate for most applications involving
proper integrals.
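
Before attempting a proof it can be reassuring to test the formula numerically. The sketch below uses our own example $G(x, y) = \sin(xy)$ on $0 \le y \le 1$, for which $D_1 G(x, y) = y\cos(xy)$; the midpoint rule and the central difference are merely convenient approximations, and all names are illustrative.

```python
import math

def midpoint_integral(f, a, b, n=2000):
    """Approximate the integral of f over [a, b] by the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def F(x):
    """F(x) = integral over y in [0, 1] of G(x, y) = sin(x*y)."""
    return midpoint_integral(lambda y: math.sin(x * y), 0.0, 1.0)

def integral_of_D1G(x):
    """Integral over y in [0, 1] of D_1 G(x, y) = y*cos(x*y)."""
    return midpoint_integral(lambda y: y * math.cos(x * y), 0.0, 1.0)

def central_difference(func, x, h=1e-5):
    """Symmetric difference quotient approximating func'(x)."""
    return (func(x + h) - func(x - h)) / (2.0 * h)

if __name__ == "__main__":
    for x in (0.5, 1.3, 4.0):
        print(x, central_difference(F, x), integral_of_D1G(x))
```

Here $F(x) = (1 - \cos x)/x$ for $x \neq 0$, so both columns can also be compared with the exact derivative $(x\sin x - 1 + \cos x)/x^2$.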


12.2.1 Exercises 


1. Suppose that $[a, b]$ is a bounded interval, and $A$ an open interval. Let $G(x, y)$ be a
function defined for $x$ in $A$ and $a \le y \le b$. Suppose that $G$ is twice differentiable
with respect to its first variable, whilst holding its second fixed, and that there
exists a constant $M$ such that $|D_1^2 G(x, y)| \le M$ for all $x$ in $A$ and $y$ in $[a, b]$.
Finally suppose that for each $x$ in $A$ the integrals

$$\int_a^b G(x, y)\,dy \quad\text{and}\quad \int_a^b D_1 G(x, y)\,dy$$

exist. Prove that, for each $x$ in $A$, we have

$$\frac{d}{dx}\int_a^b G(x, y)\,dy = \int_a^b D_1 G(x, y)\,dy.$$

Hint. Use Taylor’s theorem to estimate $G(x + h, y) - G(x, y) - hD_1 G(x, y)$.


2. Define the functions

$$f(x) = \left( \int_0^x e^{-t^2}\,dt \right)^2$$

and

$$g(x) = \int_0^1 \frac{e^{-x^2(t^2+1)}}{t^2+1}\,dt.$$

(a) Show that $g'(x) + f'(x) = 0$ for all $x$.
(b) Deduce that $g(x) + f(x) = \pi/4$.
(c) Prove that

$$\int_{-\infty}^{\infty} e^{-t^2}\,dt = \sqrt{\pi}.$$

3. Let
$$f(x) = \int_{-\infty}^{\infty} e^{-t^2} \cos(xt)\,dt.$$

Show that $2f'(x) + xf(x) = 0$. Deduce that $f(x) = \sqrt{\pi}\, e^{-x^2/4}$.

Hint. Justify the differentiation under the integral sign by using Taylor’s theorem.
The simple differential equation can be treated like the one in the proof of
Proposition 11.17.

4. Let $f$ have derivatives of all orders in an interval $A$, let $c$ be a point of $A$ and
suppose that $f(c) = 0$. Show that

$$f(x) = (x - c)\int_0^1 f'(tx + (1 - t)c)\,dt.$$

Deduce, using differentiation under the integral sign, that $f(x)/(x - c)$ extends
to a function in $A$ having derivatives of all orders.

Note. The proof of the formula should only need the first derivative of $f$ and its continuity. If
$f$ has derivatives up to order $m$ then we know, by Sect. 5.8 Exercise 13, that $f(x)/(x - c)$ has
derivatives up to order $m - 1$ (including at $c$). However, it is quite problematic to obtain the last
derivative by the method of the present exercise.


12.3 The Maclaurin–Cauchy Theorem


Improper integrals provide a useful convergence test for positive series, possibly 
the most useful after the ratio test. It underscores the intimate connection between 


integrals and series. 


The integral test. Let $f : [1, \infty[ \to \mathbb{R}$ be positive and decreasing. Then the integral
$\int_1^\infty f$ is convergent if and only if the series $\sum_{n=1}^{\infty} f(n)$ is convergent.


As an example we can consider the series
$$\sum_{n=2}^{\infty} \frac{1}{n^a (\ln n)^b}.$$
It is convergent if and only if, either $a > 1$, or $a = 1$ and $b > 1$. This is because the
change of variables $t = \ln x$ gives

$$\int_2^\infty \frac{dx}{x^a (\ln x)^b} = \int_{\ln 2}^\infty \frac{e^{(1-a)t}}{t^b}\,dt$$
and the integral obtained is convergent under the selfsame conditions.


The integral test results from the simple observation that $\sum_{k=1}^{n} f(k)$ is an upper
sum for the integral $\int_1^{n+1} f$ using a partition by integers, whilst $\sum_{k=2}^{n} f(k)$ is a lower
sum for the integral $\int_1^{n} f$. Hence

$$\int_1^{n+1} f \le \sum_{k=1}^{n} f(k) \le f(1) + \int_1^{n} f.$$

Letting $n \to \infty$ we obtain

$$\int_1^{\infty} f \le \sum_{k=1}^{\infty} f(k) \le f(1) + \int_1^{\infty} f$$

where we are allowing the value $\infty$ for one or more of the limits. The comparison
embodied in these inequalities is often useful.
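
The inequalities are easily checked in a concrete case. The sketch below uses the illustrative choice $f(x) = 1/x^2$, for which the integrals are known exactly; the function names are ours.

```python
def f(x):
    return 1.0 / (x * x)

def integral_f(a, b):
    """Exact integral of 1/x**2 over [a, b] for 0 < a <= b."""
    return 1.0 / a - 1.0 / b

def partial_sum(n):
    """f(1) + f(2) + ... + f(n)."""
    return sum(f(k) for k in range(1, n + 1))

def sandwich(n):
    """Return (lower, s_n, upper): integral over [1, n+1], partial sum, f(1) + integral over [1, n]."""
    return integral_f(1, n + 1), partial_sum(n), f(1) + integral_f(1, n)

if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        print(n, sandwich(n))
```

In the limit the bounds read $1 \le \sum_{k=1}^{\infty} 1/k^2 \le 2$, a crude but honest estimate of the sum.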

The integral test is also an immediate consequence of a most striking result, 
that makes a sharper comparison between the sum and the integral under the same 
conditions as the integral test. 


Proposition 12.1 (Maclaurin–Cauchy theorem) Let $f : [1, \infty[ \to \mathbb{R}$ be positive and
decreasing. Then the limit

$$L = \lim_{n\to\infty} \left( \sum_{k=1}^{n} f(k) - \int_1^n f \right)$$
exists and $0 \le L \le f(1)$.


Proof Let
$$\phi(n) = \sum_{k=1}^{n} f(k) - \int_1^n f.$$

Then

$$\phi(n) = f(1) + \sum_{k=1}^{n-1} f(k+1) - \sum_{k=1}^{n-1} \int_k^{k+1} f = f(1) - \sum_{k=1}^{n-1} \int_k^{k+1} \bigl(f(x) - f(k+1)\bigr)\,dx.$$

Each term in the final sum is positive. We conclude that $\phi(n)$ is decreasing and
$\phi(n) \le f(1)$. But we also have

$$\phi(n) = f(n) + \sum_{k=1}^{n-1} f(k) - \sum_{k=1}^{n-1} \int_k^{k+1} f = f(n) + \sum_{k=1}^{n-1} \int_k^{k+1} \bigl(f(k) - f(x)\bigr)\,dx.$$

Again each term in the final sum is positive, and therefore $\phi(n) \ge f(n) \ge 0$. We
conclude that the limit $L = \lim_{n\to\infty} \phi(n)$ exists, and $0 \le L \le f(1)$.

Fig. 12.1 Maclaurin–Cauchy. Picture of the proof. Sum minus integral equals $a + b + c + d + e + f + \cdots$


The proof is strikingly illustrated in Fig. 12.1. 


12.3.1 The Euler–Mascheroni Constant


An important and striking example of the Maclaurin–Cauchy theorem is provided
by setting $f(x) = 1/x$. We conclude that

$$\gamma = \lim_{n\to\infty} \left( \sum_{k=1}^{n} \frac{1}{k} - \ln n \right)$$

exists and $0 < \gamma < 1$. A more precise value is $\gamma = 0.5772156649\ldots$. As we shall
see, this number crops up in unexpected places in analysis and remains somewhat 
mysterious. It is still not known whether it is rational or irrational. 
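
The sequence whose limit defines $\gamma$ is easy to compute, though it converges slowly. A minimal sketch follows; the name `phi` mirrors the notation of the proof of Proposition 12.1, and `math.fsum` is used only to keep rounding errors in the harmonic sum under control.

```python
import math

def phi(n):
    """phi(n) = 1 + 1/2 + ... + 1/n - ln(n); its limit is the Euler-Mascheroni constant."""
    return math.fsum(1.0 / k for k in range(1, n + 1)) - math.log(n)

if __name__ == "__main__":
    for n in (10, 100, 1000, 10_000, 100_000):
        print(n, phi(n))
```

The printed values decrease towards $0.5772\ldots$, in agreement with the proof, and the error behaves roughly like $1/(2n)$.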
12.3.2 Exercises


1. Revisit Sect. 3.8 Exercise 10. 
2. Study the following series and draw conclusions using the Maclaurin—Cauchy 


theorem: 
= 

(a) Dey 
ae | 

b 

(b) 2 ninn 
00 1 

"7 > n+l 
n=1 


3. The series

$$\sum_{n=1}^{\infty} \frac{x}{n^p (1 + nx^2)} \qquad (12.1)$$

was studied in Sect. 11.2 Exercise 6. It turned out the series was uniformly convergent
with respect to the whole of $\mathbb{R}$ in the case that $p > \frac{1}{2}$. If $0 < p \le \frac{1}{2}$ the
series remains pointwise convergent. Prove that in this case the series fails to be
uniformly convergent. Use the following steps:

(a) Show that for each $x > 0$ we have

$$\sum_{n=1}^{\infty} \frac{x}{n^p (1 + nx^2)} \ge \int_1^\infty \frac{x}{t^p (1 + tx^2)}\,dt.$$

(b) Show that

$$\int_1^\infty \frac{x}{t^p (1 + tx^2)}\,dt \ge \frac{x^{2p-1}}{p(1 + x^2)^p}.$$

(c) Let $f(x)$ be the sum of the series (12.1). Show that $f$ cannot be continuous
at 0 if $0 < p \le \frac{1}{2}$.
(d) Conclude that if $0 < p \le \frac{1}{2}$ the series, though pointwise convergent, is not
uniformly convergent with respect to any interval that contains 0.
3. Show that
$$\gamma = 1 - \int_1^\infty \frac{x - [x]}{x^2}\,dx.$$

Hint. Use Sect. 8.1 Exercise 12.
4. Recall the Riemann zeta function (Sects. 11.4 Exercise 30 and 11.7 Exercise 15):
$$\zeta(s) = \sum_{n=1}^{\infty} n^{-s}.$$

Show that for $s > 1$ we have

$$\zeta(s) = \frac{1}{2} + \frac{1}{s-1} - sF(s)$$

where

$$F(s) = \int_1^\infty x^{-s-1}\left(x - [x] - \tfrac{1}{2}\right)dx.$$

Prove that the function $F(s)$ is differentiable for $s > 0$.

Hint. Use Sect. 8.1 Exercise 12 for the formula. To obtain differentiability at a
given $s > 1$ Taylor’s theorem can be helpful after guessing what the derivative
ought to be.


12.4 Complex-Valued Integrals 


Complex-valued functions of a real variable have appeared in a desultory fashion 
in the present text, never having a section of their own. We recall Sect.9.2 (under 
the heading “Logarithm of a complex number’) where differentiation of such func- 
tions was considered, in particular we differentiated Log (x + ia) and (x + ia)”. In 
Sect. 11.4 we studied the function $e^{ix}$, central to the unification of circular functions
and the exponential function, and later in the same section we differentiated $x^a$ for
a complex power $a$.

A complex-valued function f of a real variable can be expressed as f = u +iv 
where u and v are real-valued functions. Concepts such as continuity, boundedness 
and limit can be defined for them by reference to the real and imaginary parts, just as 
we defined differentiation in Sect. 9.2. Thus $f$ is said to be bounded when $u$ and $v$ are
bounded, continuous when $u$ and $v$ are continuous, and we can define $\lim_{x\to c} f(x)$
to be $\lim_{x\to c} u(x) + i \lim_{x\to c} v(x)$ provided the two limits on the right exist and are
finite.

A more satisfactory way to extend these concepts to complex-valued functions is 
to use the modulus of a complex number as a metric assigning a distance between 
two points in the complex field. Although metrics are definitely not intended to be 
part of this text, this is what we did when defining the limit of a complex sequence 
in Sect. 10.1. 


Exercise Let $f : A \to \mathbb{C}$ where $A$ is a real number interval. Reformulate the above
definitions in terms of the modulus of a complex number as follows:
(a) Let $c \in A$. Show that $f$ is continuous at $c$ if and only if it satisfies the following
condition: for all $\varepsilon > 0$ there exists $\delta > 0$, such that $|f(x) - f(c)| < \varepsilon$ for all
$x$ in $A$ that satisfy $|x - c| < \delta$.

(b) Show that $f$ is bounded if and only if the function $|f|$ is bounded.

(c) Let $c \in A$ and let $\ell$ be a complex number. Show that $\lim_{x\to c} f(x) = \ell$ if and
only if the following condition is satisfied: for all $\varepsilon > 0$ there exists $\delta > 0$, such
that $|f(x) - \ell| < \varepsilon$ for all $x$ in $A$ that satisfy $0 < |x - c| < \delta$.


We turn our attention to integrals. Let $f : [a, b] \to \mathbb{C}$ and set $u = \operatorname{Re} f$ and $v = \operatorname{Im} f$. We define
$$\int_a^b f = \int_a^b u + i\int_a^b v$$

if both integrals on the right-hand side exist. The definition is then extended to
improper integrals; so for example $\int_a^\infty f$ is said to be convergent when the integrals
$\int_a^\infty u$ and $\int_a^\infty v$ are both convergent and then we set

$$\int_a^\infty f = \int_a^\infty u + i\int_a^\infty v.$$

It is easy to prove that the integrals of complex-valued functions satisfy similar 
rules to those obeyed by real functions as regards the sum of two functions, and 
the product of a function by a complex scalar. In other words integration is a linear 
operation over the complex numbers. Moreover the fundamental theorem in its sim- 
plest manifestation, Proposition 6.17, extends easily to the case of complex-valued 
functions, as does also the rule for integration by parts. 

Most importantly we can use Cauchy’s principle for a complex integrand, simply
by reinterpreting the absolute value as the modulus. The integral $\int_a^\infty f$ is convergent
if and only if the following condition is satisfied: for all $\varepsilon > 0$ there exists $K > 0$,
such that $\left|\int_x^y f\right| < \varepsilon$ for all $x$ and $y$ that satisfy $K < x < y$.

The extension of Cauchy’s principle to the case of a complex integrand is a
straightforward exercise left to the reader.


Proposition 12.2 Let $f : [a, b] \to \mathbb{C}$ be integrable (that is, $\operatorname{Re} f$ and $\operatorname{Im} f$ are both
integrable). Then $|f|$ is integrable and
$$\left| \int_a^b f \right| \le \int_a^b |f|.$$

Proof The proof that $|f|$ is integrable is left to the reader (it may help to recall
Sect. 6.4 Exercise 13).

Let $f = u + iv$ and $\int f = A = a + ib$ ($u$, $v$ real functions, $a$, $b$ real numbers).
Now $|A|^2$ is a real number, so that, using the Cauchy–Schwarz inequality (Sect. 2.2
Exercise 17), we find
$$|A|^2 = \bar{A}\int f = \int \bar{A} f = \int (au + bv) \quad \text{(it’s real!)}$$
$$\le \int (a^2 + b^2)^{1/2} (u^2 + v^2)^{1/2} = |A| \int |f|.$$

We conclude that $|A| \le \int |f|$.


The preceding proposition is such a basic tool, and is used so often, that we rarely 
refer to it in justification. 
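
A numerical illustration may help fix the statement. The sketch below takes the example $f(x) = e^{ix}$ on $[0, \pi]$ (our choice), for which $\int f = 2i$ and $\int |f| = \pi$; Python’s built-in complex numbers and the `cmath` module do the arithmetic, and the midpoint rule stands in for the integral.

```python
import cmath
import math

def midpoint_integral(f, a, b, n=20_000):
    """Midpoint rule; it works unchanged for complex-valued f."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def f(x):
    return cmath.exp(1j * x)

if __name__ == "__main__":
    integral = midpoint_integral(f, 0.0, math.pi)
    integral_of_modulus = midpoint_integral(lambda x: abs(f(x)), 0.0, math.pi)
    print("integral of f       :", integral)
    print("modulus of integral :", abs(integral))
    print("integral of |f|     :", integral_of_modulus)
```

The modulus of the integral is 2, comfortably below $\pi$; the gap reflects the cancellation produced by the rotating phase of $e^{ix}$.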


12.4.1 Absolutely Convergent Integrals 


We can learn about the integral $\int_0^\infty f$ by studying the integral $\int_0^\infty |f|$.


Proposition 12.3 Let $f : [0, \infty[ \to \mathbb{C}$ be integrable on $[0, L]$ for all $L > 0$. If the
integral $\int_0^\infty |f|$ is convergent then so also is the integral $\int_0^\infty f$.


The proposition applies, with obvious modifications, to other types of improper 
integrals. 


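Proof For $0 < x < y$ we have

$$\left| \int_0^y f - \int_0^x f \right| = \left| \int_x^y f \right| \le \int_x^y |f|.$$

Assume that the integral $\int_0^\infty |f|$ is convergent. Let $\varepsilon > 0$. By Cauchy’s principle
there exists $K$, such that
$$\int_x^y |f| < \varepsilon$$

for all $x$ and $y$ that satisfy $K < x < y$. But for the same $x$ and $y$ we then have

$$\left| \int_0^y f - \int_0^x f \right| < \varepsilon$$

by the displayed inequality, and Cauchy’s principle tells us that the limit

$$\lim_{x\to\infty} \int_0^x f$$

exists and is finite.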


The proposition motivates a definition, analogous to the case of infinite series. 
Definition If the integral $\int_0^\infty |f|$ is convergent, then the integral $\int_0^\infty f$ is said to be
absolutely convergent.


In analogy with series, and because it is convenient to have a term, we can call 
integrals that are convergent, but not absolutely convergent, conditionally convergent. 

Absolutely convergent integrals are generally easier to handle than conditionally 
convergent ones, because integrals with positive integrands are easier to estimate. 
In contrast, proving that an integral is conditionally convergent can be tricky. It is 
striking, and the reader will be shown many examples in the material to follow, that 
two tools often prove useful in conjunction with studying conditionally convergent 
integrals. They are the second mean value theorem for integrals (see Sects.6.8 and 
8.1 Exercise 15) and Cauchy’s principle. 
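
The contrast between the two kinds of convergence can be seen numerically in the standard example $\int_0^\infty (\sin x)/x\,dx$, which also appears in the exercises of this section. The sketch below is an illustration only, using a midpoint rule of our own; the lower bound $(2/\pi)(1 + \tfrac12 + \cdots + \tfrac1N)$ for $\int_0^{N\pi} |\sin x|/x\,dx$ comes from estimating $1/x \ge 1/(k\pi)$ on the $k$th arch of the sine.

```python
import math

def midpoint_integral(f, a, b, n):
    """Approximate the integral of f over [a, b] by the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def sinc(x):
    return math.sin(x) / x      # the midpoint rule never evaluates this at x = 0

def partial_integral(L, n=200_000):
    """Integral of sin(x)/x over [0, L]."""
    return midpoint_integral(sinc, 0.0, L, n)

def partial_abs_integral(L, n=200_000):
    """Integral of |sin(x)/x| over [0, L]."""
    return midpoint_integral(lambda x: abs(sinc(x)), 0.0, L, n)

if __name__ == "__main__":
    for N in (10, 50, 200):
        L = N * math.pi
        print(N, partial_integral(L), partial_abs_integral(L))
```

The first column of values oscillates ever more tightly about $\pi/2$, while the second grows without bound, roughly like $(2/\pi)\ln N$.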


12.4.2 Exercises


1. Show that a complex integral $\int f$ is absolutely convergent if and only if the
two real integrals $\int \operatorname{Re} f$ and $\int \operatorname{Im} f$ are absolutely convergent.
2. Solve the following indefinite integrals, given that the constant $a$ may be complex:

(a) $\int e^{ax}\,dx$

(b) $\int x^m e^{ax}\,dx$,
where $m$ is a positive integer.

(c) $\int \frac{1}{(x + a)^m}\,dx$,
where $m$ is a non-negative integer and $\operatorname{Im} a \neq 0$.


(d) [rove + iar. 


4, Solve the integral { x” cos(Ax) dx (where A is real) by exploiting the formula 
cosx = Ree’*. 

5. Show that the integral $\int_0^\infty e^{-px}\,dx$ is absolutely convergent for all complex $p$
such that $\operatorname{Re} p > 0$, and evaluate it.

6. Show that the integral [°° e~@+! » dx is absolutely convergent. 


7. Show that the integral
$$\int_0^\infty \frac{\sin x}{x}\,dx$$

is convergent (at $\infty$) but that the integral
$$\int_0^\infty \left| \frac{\sin x}{x} \right| dx$$

is divergent.
Hint. For the first claim one approach is to integrate by parts. Note that the integral
is not improper at 0.

8. (◊) Extend Hölder’s inequality to improper integrals (see Sect. 6.4 Exercise 10).
Let $f$ and $g$ be functions defined in $[0, \infty[$, integrable on every interval $[0, L]$.
Let $p$ and $q$ be positive numbers that satisfy

$$\frac{1}{p} + \frac{1}{q} = 1.$$

Suppose that the integrals $\int_0^\infty |f|^p$ and $\int_0^\infty |g|^q$ are convergent. Then the integral
$\int_0^\infty fg$ is absolutely convergent and

$$\int_0^\infty |fg| \le \left( \int_0^\infty |f|^p \right)^{1/p} \left( \int_0^\infty |g|^q \right)^{1/q}.$$


In the next three exercises we shall explore the improper integral $\int_0^\infty fg$. The
results are convergence tests that resemble the variants of Dirichlet’s test for series
given in Sect. 11.4. The proposed method is to obtain convergence at $\infty$ by using
the second mean value theorem for integrals and Cauchy’s principle. Recall that the
second mean value theorem asserts that, under the conditions that $f$ is monotonic
and $g$ is real valued and integrable on $[a, b]$, there exists $\xi$ in $[a, b]$, such that

$$\int_a^b fg = f(a)\int_a^\xi g + f(b)\int_\xi^b g.$$

In full generality this was proved in Sect. 6.8, and is the reason for marking the
exercises with the nugget symbol. Under stronger conditions, for example if $f'$ is
continuous and positive (alternatively negative), and $g$ continuous, easier proofs of
the second mean value theorem were suggested in the exercises in Sect. 8.1. The
reader who has not studied Sect. 6.8 might prefer to adopt these stronger conditions
on $f$ and $g$ rather than skip these exercises.


8. (◊) Let $f$ and $g$ be functions with domain $[0, \infty[$, and assume that $f$ is monotonic
and that $g$, which may be complex valued, is integrable on each interval $[0, L]$.
Suppose that there exists a constant $K > 0$, such that

$$\left| \int_0^x g \right| < K$$

for all $x > 0$, and suppose that $\lim_{x\to\infty} f(x) = 0$. Show that the integral $\int_0^\infty fg$
is convergent.

9. (◊) Let $f$ and $g$ be functions with domain $[0, \infty[$, and assume that $f$ is monotonic
and that $g$, which may be complex valued, is integrable on each interval $[0, L]$.
Suppose that the integral $\int_0^\infty g$ is convergent and that $f$ is bounded as $x \to \infty$.
Show that the integral $\int_0^\infty fg$ is convergent.


The restriction in the second of the above two convergence tests to monotonic and
bounded $f$ can be weakened. Obviously it is enough if $f$ can be expressed as a linear
combination of functions that satisfy this condition. It turns out that a large class of
functions can be expressed as the difference of two bounded monotonic functions,
and requiring this of $f$, along with the stated conditions on $g$, clearly suffices to
guarantee convergence of the integral $\int_0^\infty fg$. This is the content of the following
exercise.

In the third item of the exercise we extend the test to allow complex-valued $f$.
Although it could totally supplant the test given in Exercise 9, the latter is still very
useful because of its relatively simple and memorable conditions.


10. Prove the following claims, of which the third is a convergence test for the
integral $\int_0^\infty fg$ to stand alongside Exercises 8 and 9.

(a) Let $f$ be real valued with continuous first derivative, and assume that the
integral $\int_0^\infty |f'|$ is convergent, and that $f$ is bounded as $x \to \infty$. Then there
exist functions $g$ and $h$, both increasing and bounded as $x \to \infty$, such that
$f = g - h$.

Hint. Take $g(x) = \int_0^x |f'|$.

(b) Let $f$ be complex valued with continuous first derivative, and assume that
the integral $\int_0^\infty |f'|$ is convergent, and that $f$ is bounded as $x \to \infty$. Then
there exist real-valued functions $g_1$, $g_2$, $h_1$ and $h_2$, all increasing and bounded
as $x \to \infty$, such that $f = (g_1 - g_2) + i(h_1 - h_2)$.

(c) Let $f$ and $g$ be complex-valued functions with domain $[0, \infty[$, assume that $g$
is integrable on each interval $[0, L]$, that $f'$ is continuous and $f$ is bounded
as $x \to \infty$. Assume that the integrals $\int_0^\infty g$ and $\int_0^\infty |f'|$ are convergent.
Then the integral $\int_0^\infty fg$ is convergent.

Note. It is possible to go even further in weakening the conditions on the function $f$. Some
clues to this may be found in Sect. 6.5.


11. Here are some examples that illustrate the tests given in the preceding exercises.
Show that the following integrals are convergent:
(a) $\int_0^\infty \frac{\sin x}{\sqrt{x}}\,dx$

(b) $\int_0^\infty \frac{\sin x \tanh x}{\sqrt{x}}\,dx$

(c) $\int_0^\infty \frac{\sin x \tanh x \sin(1/x)}{\sqrt{x}}\,dx$

(d) $\int_0^\infty \frac{\sin x \tanh x \sin(1/x)}{\sqrt{x}\,(\sqrt{x} + i)}\,dx$

Note that all four integrals are proper at 0, in spite of the occurrence of $1/\sqrt{x}$
and $1/x$.
12. (a) Show that the function $F(x) = x - [x] - \frac{1}{2}$ is periodic with period 1 and
mean 0. For helpful information concerning periodic functions consult
Sect. 6.5 Exercise 5.
(b) Show that the integral

$$\int_1^\infty \frac{F(x)}{x}\,dx$$

is convergent.
13. (◊) Let

$$B = \int_1^\infty \frac{F(x)}{x}\,dx$$

(see the previous exercise). Use Sect. 8.1 Exercise 12 to show that

$$B = \lim_{n\to\infty} \left( \ln(n!) - \left(n + \tfrac{1}{2}\right)\ln n + n - 1 \right).$$

Deduce that $B = \frac{1}{2}\ln(2\pi) - 1$.

Hint. Use Stirling’s approximation to $n!$ (see the eponymous nugget).

Note. This exercise can also be viewed as a proof of Stirling’s approximation, somewhat shorter
than the one given in Sect. 11.10. If regarded as such then one must determine $B$ without using
Stirling’s formula; using, for example, Wallis’ product for $\pi$, as was proposed in Sect. 11.10
in order to determine the number $A$, related to $B$ by $B = \ln A - 1$.

In the remaining exercises of this section we shall study a remarkable result: the
Euler–Maclaurin summation formula. The reason for marking these exercises with
the nugget symbol is the important and surprising role played by the sequence of
Bernoulli numbers $(B_n)_{n=0}^{\infty}$. These are defined so that the numbers $B_n/n!$ are the
coefficients in the Maclaurin series of the function $x/(e^x - 1)$, which is extended
to have the value 1 at $x = 0$. The Bernoulli numbers were studied in Sect. 11.8; the
reader is advised to turn back some pages and read about them, if they have not
already done so, before proceeding.


14. (◊) Let $f$ be infinitely-often differentiable in an interval $A$. Let $a$ and $b$ be
integers in $A$ such that $a < b$. In a sequence of exercises the reader is invited to
prove the Euler–Maclaurin summation formula

$$\sum_{k=a}^{b} f(k) = \int_a^b f + \tfrac{1}{2}\bigl(f(a) + f(b)\bigr) + \sum_{k=2}^{m} \frac{B_k}{k!}\left( f^{(k-1)}(b) - f^{(k-1)}(a) \right) + R_m$$

where the constants $B_k$ are universal (that is, independent of $f$, $a$, $b$ and $m$),
and $R_m$ is a remainder term. The constants will be identified with the Bernoulli
numbers, and the remainder elucidated, in the course of the proof.
We begin by recalling the formula

$$\sum_{k=a}^{b} f(k) = \int_a^b f + \tfrac{1}{2}\bigl(f(a) + f(b)\bigr) + \int_a^b \left( x - [x] - \tfrac{1}{2} \right) f'(x)\,dx$$

which was given in Sect. 8.1 Exercise 12. This will be the case $m = 1$ of the
summation formula.

(a) Let $F_1(x) = x - [x] - \frac{1}{2}$. Show that there exists a unique sequence of functions
$(F_n)_{n=1}^{\infty}$, in which each function is periodic with period 1 and mean 0,
and $F_{n+1}$ is a primitive of $F_n$ for $n = 1, 2, \ldots$.

Hint. Consult Sect. 6.5 Exercise 5.
(b) Show that the summation formula holds with $B_k = (-1)^k k!\, F_k(0)$, for $k \ge 2$,
and

$$R_m = (-1)^{m-1} \int_a^b F_m f^{(m)}.$$

Hint. Consult Sect. 8.1 Exercise 16.

(c) Show that $B_k$ (for $k \ge 2$) is the $k$th Bernoulli number.
Hint. Consider the case f(x) = x” and compare with the nugget on 
Bernoulli numbers. 


Note that actually the Bernoulli numbers $B_k$ satisfy $B_k = k!\,F_k(0)$, for $k \ge 2$, in
addition to $B_k = (-1)^k k!\,F_k(0)$, because in fact, as we saw in Sect. 11.8, the Bernoulli
numbers $B_k$ with odd $k$ are 0 (except for $B_1$). This observation can prevent much
anguish caused by the appearance of unwanted minus signs.
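As a sanity check on the summation formula (not a substitute for the proof), the following Python sketch evaluates its right-hand side exactly for a polynomial $f$. It assumes that the Bernoulli numbers are generated by the standard recurrence $\sum_{j=0}^{n}\binom{n+1}{j}B_j = 0$ with $B_0 = 1$ (so that $B_1 = -1/2$), and it takes $m$ larger than the degree of $f$, so that $f^{(m)} = 0$ and the remainder $R_m$ vanishes. All function names are illustrative.

```python
from fractions import Fraction
from math import comb, factorial


def bernoulli(n_max):
    """Return [B_0, ..., B_n_max] for x/(e^x - 1) = sum B_n x^n / n!."""
    B = [Fraction(1)]
    for n in range(1, n_max + 1):
        # Recurrence: sum_{j=0}^{n} C(n+1, j) B_j = 0, solved for B_n.
        B.append(-sum(comb(n + 1, j) * B[j] for j in range(n)) / (n + 1))
    return B


def poly_eval(c, x):
    """Evaluate the polynomial with coefficients c (lowest power first) at x."""
    return sum(Fraction(ck) * x**k for k, ck in enumerate(c))


def poly_deriv(c):
    """Coefficients of the derivative of the polynomial c."""
    return [k * ck for k, ck in enumerate(c)][1:]


def poly_integral(c, a, b):
    """Exact integral of the polynomial c over [a, b]."""
    prim = [Fraction(0)] + [Fraction(ck, k + 1) for k, ck in enumerate(c)]
    return poly_eval(prim, b) - poly_eval(prim, a)


def euler_maclaurin_poly(c, a, b):
    """Right-hand side of the summation formula for a polynomial f.

    With m = len(c) the derivative f^(m) is zero, so R_m = 0 and the
    formula is exact.
    """
    m = len(c)
    B = bernoulli(m)
    total = poly_integral(c, a, b)
    total += Fraction(1, 2) * (poly_eval(c, a) + poly_eval(c, b))
    d = c
    for k in range(2, m + 1):
        d = poly_deriv(d)  # d now holds the coefficients of f^(k-1)
        total += B[k] / factorial(k) * (poly_eval(d, b) - poly_eval(d, a))
    return total
```

For $f(x) = x^3$, $a = 0$ and $b = n$ the function returns $\bigl(n(n+1)/2\bigr)^2$, the familiar sum of cubes.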

It also turns out that the function $F_n(x)$ in the Euler–Maclaurin summation formula
is a polynomial function of degree $n$ of the periodic function $x - [x]$. Define the
functions $P_n(t)$ in terms of the coefficients in the Maclaurin series

$$\frac{x e^{tx}}{e^x - 1} = \sum_{n=0}^{\infty} \frac{P_n(t)}{n!}\, x^n$$

as shown here. The series converges in an interval $]-r, r[$, the same as the one in
which the formula

$$\frac{x}{e^x - 1} = \sum_{n=0}^{\infty} \frac{B_n}{n!}\, x^n$$

is valid. This is studied in the next exercise.
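The polynomials $P_n$ can be tabulated by machine before attempting the exercise. The sketch below assumes the generating function is $x e^{tx}/(e^x - 1) = \sum_n P_n(t)\,x^n/n!$ and forms the Cauchy product of the series for $x/(e^x - 1)$ and $e^{tx}$; the helper names are hypothetical, and the Bernoulli numbers come from the standard recurrence with $B_1 = -1/2$.

```python
from fractions import Fraction
from math import comb


def bernoulli(n_max):
    """Return [B_0, ..., B_n_max] for x/(e^x - 1) = sum B_n x^n / n!."""
    B = [Fraction(1)]
    for n in range(1, n_max + 1):
        B.append(-sum(comb(n + 1, j) * B[j] for j in range(n)) / (n + 1))
    return B


def bernoulli_poly(n, B):
    """Coefficients of P_n(t), lowest power of t first.

    The coefficient of x^n in (sum B_k x^k/k!) * (sum t^j x^j/j!) is
    sum_j B_{n-j} t^j / ((n-j)! j!); multiplying by n! gives C(n, j) B_{n-j}.
    """
    return [comb(n, j) * B[n - j] for j in range(n + 1)]


def poly_deriv(c):
    """Coefficients of the derivative of the polynomial c."""
    return [j * cj for j, cj in enumerate(c)][1:]
```

With these definitions one can confirm, for small $n$, that $P_1(t) = t - \frac12$, that $P_n' = nP_{n-1}$, and that $P_n(1) = P_n(0)$ for $n \ge 2$.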


15. (a) Show that

$$P_n(t) = \sum_{k=0}^{n} \binom{n}{k} B_k\, t^{n-k}.$$

In particular $P_n(0) = B_n$.
(b) Show that $P_n'(t) = n P_{n-1}(t)$ for $n = 1, 2, \ldots$
(c) Show that $P_n(1) = P_n(0)$ for $n = 2, 3, \ldots$.
(d) Show that the function $F_n$ in the summation formula is given by

$$F_n(x) = \frac{P_n(x - [x])}{n!}.$$

16. Suppose that for each $j \ge 1$ we have $\lim_{x\to\infty} f^{(j)}(x) = 0$ and the function $f^{(j)}$
is monotonic.

(a) Prove, in the notation of the Euler–Maclaurin summation formula, that for
each $m$ we have

$$\int_a^{\infty} \Bigl(x - [x] - \frac{1}{2}\Bigr) f'(x)\,dx = -\sum_{k=2}^{m} \frac{B_k}{k!} f^{(k-1)}(a) + (-1)^{m-1} \int_a^{\infty} F_m f^{(m)}.$$

(b) Show that the summation formula can be written as

$$\sum_{k=a}^{b} f(k) = \int_a^b f + \frac{1}{2}\bigl(f(a) + f(b)\bigr) + \sum_{k=2}^{m} \frac{B_k}{k!} f^{(k-1)}(b) + \int_a^{\infty} \Bigl(x - [x] - \frac{1}{2}\Bigr) f'(x)\,dx - (-1)^{m-1} \int_b^{\infty} F_m f^{(m)}.$$

This can be useful because the last term tends to 0 as $b \to \infty$.

17. (◊) Prove the following generalisation of Stirling’s approximation (see
Sect. 11.10). For each $m$ we have, as $n \to \infty$:

$$\ln n! = n \ln\Bigl(\frac{n}{e}\Bigr) + \frac{1}{2}\ln(2\pi) + \frac{1}{2}\ln(n) + \sum_{k=2}^{m} \frac{(-1)^k B_k}{k(k-1)\,n^{k-1}} + O(n^{-m}).$$

The formula given here is an asymptotic expansion for $\ln(n!)$. The asymptotic
scale that appears here is pretty, comprising the sequence of functions

$$n \ln n,\quad n,\quad \ln n,\quad 1,\quad \frac{1}{n},\quad \frac{1}{n^2},\quad \frac{1}{n^3},\ \ldots$$

exhibiting progressively slower “growth” as $n \to \infty$. For an explanation of the
big-O notation and asymptotic expansions see the nugget “Asymptotic orders of
magnitude”.
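The quality of the expansion for $\ln n!$ is easy to observe numerically. The following sketch assumes the expansion in the form $n\ln n - n + \frac12\ln n + \frac12\ln(2\pi) + \sum_{k=2}^{m}(-1)^k B_k/\bigl(k(k-1)n^{k-1}\bigr)$ and compares it with `math.lgamma(n + 1)`, which returns $\ln n!$; the function names are illustrative.

```python
import math
from fractions import Fraction
from math import comb


def bernoulli(n_max):
    """Return [B_0, ..., B_n_max] for x/(e^x - 1) = sum B_n x^n / n!."""
    B = [Fraction(1)]
    for n in range(1, n_max + 1):
        B.append(-sum(comb(n + 1, j) * B[j] for j in range(n)) / (n + 1))
    return B


def ln_factorial_expansion(n, m):
    """Expansion of ln n! truncated after the term with index k = m (m >= 1)."""
    B = bernoulli(m)
    s = n * math.log(n) - n + 0.5 * math.log(n) + 0.5 * math.log(2 * math.pi)
    for k in range(2, m + 1):
        s += (-1) ** k * float(B[k]) / (k * (k - 1) * n ** (k - 1))
    return s


def expansion_error(n, m):
    """Absolute difference between the truncated expansion and ln n!."""
    return abs(ln_factorial_expansion(n, m) - math.lgamma(n + 1))
```

Already at $n = 10$ the error falls sharply as $m$ increases from 2 to 4 to 6, in line with the $O(n^{-m})$ estimate.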
12.5 (◊) Integral Transforms


In this section we shall look at the Fourier transform and very briefly at the Laplace 
transform. These are examples of integral transforms, and are immensely important 
in applications, both within mathematics, and to science and technology. The reason 
for including them in this text is that they probably constitute the types of improper 
integral that the reader is most likely to encounter. Their most important property, 
that they can be inverted, will not be touched upon; that belongs to the area of further 
study. We limit the discussion mainly to convergence of the integral, and properties 
of the transformed function, chiefly continuity, differentiability and decay at infinity. 
The conclusions will be developed as exercises. 

An integral transform is used to transform a given function $f$ into a second function
$F$ using the prescription

$$F(y) = \int_a^b K(y, x) f(x)\,dx$$

where the function $K$, of two variables, is called the kernel of the transform. The
integral may be improper, as is the case for both the Fourier transform ($b = \infty = -a$)
and the Laplace transform ($a = 0$, $b = \infty$), both considered here.


12.5.1 Fourier Transform 


The function 


$$F(y) = \int_{-\infty}^{\infty} f(x)\, e^{-ixy}\,dx$$


is called the Fourier transform of the function f. It can be defined when f is a 
complex-valued function and the integral is convergent at both ends. 

A simple sufficient condition for the existence of the Fourier transform, and the
starting point for all studies of the Fourier transform, is that there exists $M$, such that

$$\int_{-R}^{R} |f| \le M$$

for all $R > 0$. A function $f$ that satisfies this is sometimes called absolutely
integrable. It simply means that the integral $\int_{-\infty}^{\infty} f$ is absolutely convergent at both
ends. By Proposition 12.3 the integral defining the Fourier transform is absolutely
convergent for every real $y$.

The Fourier transform is just one of many ways, though perhaps the most impor- 
tant, to transform one function into another using an integral transform. It is typically 
applied to find solutions of differential equations defined over the whole line. 
12.5.2 Exercises


1. Calculate the Fourier transforms of the following functions:


(a) $f(x) = 1$ for $a < x < b$, $f(x) = 0$ otherwise.
(b) $f(x) = e^{-ax}$ for $x > 0$, $f(x) = 0$ for $x < 0$, where $a > 0$ is a constant.
(c) $f(x) = e^{-a|x|}$, where $a > 0$ is a constant.
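Answers to exercises of this kind can be checked by numerical quadrature. The sketch below is only a rough check: it assumes the kernel $e^{-ixy}$ (with the opposite sign convention the values would be complex conjugated), truncates the integral to a finite interval, and uses the composite Simpson rule. The function name and the parameters `lo`, `hi`, `steps` are illustrative.

```python
import cmath
import math


def fourier_transform(f, y, lo, hi, steps=20000):
    """Simpson approximation of the integral of f(x) e^{-ixy} over [lo, hi]."""
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of subintervals
    h = (hi - lo) / steps
    total = 0j
    for j in range(steps + 1):
        x = lo + j * h
        weight = 1 if j in (0, steps) else (4 if j % 2 else 2)
        total += weight * f(x) * cmath.exp(-1j * x * y)
    return total * h / 3


a, y = 1.0, 2.0
# Function (b): e^{-ax} on x > 0, zero on x < 0, so integrate over [0, 40].
F_b = fourier_transform(lambda x: math.exp(-a * x), y, 0.0, 40.0)
# Function (c): e^{-a|x|}, integrated over the symmetric interval [-40, 40].
F_c = fourier_transform(lambda x: math.exp(-a * abs(x)), y, -40.0, 40.0)
```

Under the stated convention `F_b` should agree with $1/(a + iy)$ and `F_c` with $2a/(a^2 + y^2)$; the latter is real and does not depend on the sign convention.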


2. (a) Let $f(x)$ be a real odd function and let $F(y)$ be its Fourier transform. Show
that

$$F(y) = \frac{2}{i} \int_0^{\infty} f(x) \sin(xy)\,dx.$$

(b) Let $f(x)$ be a real even function and let $F(y)$ be its Fourier transform. Show
that

$$F(y) = 2 \int_0^{\infty} f(x) \cos(xy)\,dx.$$

Note. The integrals $\int_0^{\infty} f(x) \sin(xy)\,dx$ and $\int_0^{\infty} f(x) \cos(xy)\,dx$, which can be defined for a
function $f$ given on the interval $[0, \infty[$, are called the Fourier sine transform and the Fourier
cosine transform of $f$.

3. Prove that the Fourier transform $F(y)$ of an absolutely integrable function $f$ is
continuous and tends to 0 at $\infty$ and $-\infty$. This can be done in the following steps.
We let

$$F_n(y) = \int_{-n}^{n} e^{-ixy} f(x)\,dx$$

for each natural number $n$.

(a) Show that $F_n(y)$ is a continuous function of $y$ for each $n$.
Hint. Use the equality $|e^{it} - 1| = 2|\sin(t/2)|$ and the inequality $|\sin t| \le |t|$.
(b) Show that $\lim_{n\to\infty} F_n(y) = F(y)$ uniformly with respect to $y$ in $\mathbb{R}$. Deduce
that $F$ is continuous.
(c) Show that $\lim_{y\to\pm\infty} F_n(y) = 0$ for each $n$.
Hint. Prove this first in the case that $f$ is a step function. If $f$ is merely
integrable it may be approximated in the mean by a step function.
Note. In fact

$$\lim_{y\to\pm\infty} \int_a^b e^{-ixy} f(x)\,dx = 0$$

for any function $f$ integrable on $[a, b]$, a result known as the Riemann–Lebesgue lemma,
although this name is sometimes attached to the conclusion that $\lim_{y\to\pm\infty} F(y) = 0$.

(d) Deduce that $\lim_{y\to\pm\infty} F(y) = 0$.
12.5.3 Laplace Transform


The Laplace transform is an important tool in applied mathematics, particularly for 
solving linear differential equations in which time is the independent variable, and 
solutions that satisfy initial conditions are sought forward in time. We will not go 
into these applications here; they properly require another setting than the present 
text. 

Suppose that $f$ is a real-valued function with domain $[0, \infty[$, integrable on the
interval $[0, L]$ for every $L > 0$. We define the Laplace transform of $f$ by the integral

$$F(p) = \int_0^{\infty} e^{-px} f(x)\,dx$$

the understanding being that $F(p)$ is defined for all $p$ such that the integral is
convergent. It is usual to study the Laplace transform for complex $p$, and only then can
its properties be fully appreciated. However, in order to remain within the confines
of fundamental analysis we shall restrict our attention to real $p$.


12.5.4 Exercises (cont’d) 


4. Calculate the Laplace transforms of the following functions: 


(a) $x^n$, where $n$ is a non-negative integer.
(b) $e^{kx}$, where $k$ is a real constant.
(c) $\cos kx$, where $k$ is a real constant.
(d) $\sin kx$, where $k$ is a real constant.
(e) $f(x) = 1$, if $a < x < b$, and otherwise $f(x) = 0$.
Note. The case of the last function with $b = \infty$ is called the Heaviside unit step at $x = a$.
It plays an important role in technology.
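A computed Laplace transform can be compared with a numerical value of the defining integral. The sketch below truncates the integral at a finite upper limit and applies the composite Simpson rule; the name `laplace` and the parameters `upper` and `steps` are illustrative choices, and the truncation is only reasonable when $e^{-px} f(x)$ is negligible beyond `upper`.

```python
import math


def laplace(f, p, upper=60.0, steps=60000):
    """Simpson approximation of the integral of e^{-px} f(x) over [0, upper].

    `steps` must be even.
    """
    h = upper / steps
    total = 0.0
    for j in range(steps + 1):
        x = j * h
        weight = 1 if j in (0, steps) else (4 if j % 2 else 2)
        total += weight * math.exp(-p * x) * f(x)
    return total * h / 3


p, k = 2.0, 3.0
L_power = laplace(lambda x: x**3, p)           # compare with 3!/p^4
L_cos = laplace(lambda x: math.cos(k * x), p)  # compare with p/(p^2 + k^2)
L_sin = laplace(lambda x: math.sin(k * x), p)  # compare with k/(p^2 + k^2)
```

The closed forms in the comments are the standard table entries; they are quoted here only as a means of checking and should be derived as the exercise asks.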


5. In all items of this exercise we assume that $f$ is integrable on $[0, L]$ for each
$L > 0$, and that the integral $\int_0^{\infty} f$ is convergent.


(a) Show that the integrals $\int_0^{\infty} e^{-px} f(x)\,dx$ and $\int_0^{\infty} e^{-px} x f(x)\,dx$ are both
convergent for all $p > 0$.


We let

$$F(p) = \int_0^{\infty} e^{-px} f(x)\,dx, \quad (p \ge 0)$$

$$G(p) = -\int_0^{\infty} e^{-px} x f(x)\,dx, \quad (p > 0)$$

and for each positive integer $n$ we let

$$F_n(p) = \int_0^{n} e^{-px} f(x)\,dx$$

$$G_n(p) = -\int_0^{n} e^{-px} x f(x)\,dx.$$

(b) Show that $G_n(p) = F_n'(p)$.

(c) Show that $\lim_{n\to\infty} G_n(p) = G(p)$ uniformly with respect to the interval
$\delta \le p < \infty$ for any given $\delta > 0$, whereas $\lim_{n\to\infty} F_n(p) = F(p)$ uniformly
with respect to $0 \le p < \infty$.

Hint. Use Cauchy’s principle and the second mean value theorem.

(d) Show that $F(p)$ is differentiable for $p > 0$ and $F'(p) = G(p)$.

(e) Show that $F(p)$ has derivatives of all orders for $p > 0$ and they are given
by

$$F^{(k)}(p) = (-1)^k \int_0^{\infty} x^k e^{-px} f(x)\,dx.$$

(f) Show that $\lim_{p\to\infty} F(p) = 0$.
(g) Deduce also that

$$\lim_{p\to 0+} \int_0^{\infty} e^{-px} f(x)\,dx = \int_0^{\infty} f.$$

Note. The last conclusion here is an Abelian theorem, comparable to Abel’s theorem on
power series. Imagine that a value is assigned to possibly divergent integrals $\int_0^{\infty} f$, a
kind of summability method (see Sect. 11.5), by computing the limit

$$\lim_{p\to 0+} \int_0^{\infty} e^{-px} f(x)\,dx,$$

if it exists. The exercise shows that the correct value is assigned to already convergent
integrals.
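The Abelian limit in item (g) can be watched numerically. The sketch below takes $f(x) = (\sin x)/x$ as a test function (a choice made here, not in the text), for which $\int_0^\infty f$ converges only conditionally, and approximates $\int_0^\infty e^{-px} f(x)\,dx$ by Simpson's rule on $[0, 40/p]$. The comparison value $\arctan(1/p)$ is the standard closed form for this integral and is quoted as an assumption.

```python
import math


def damped_sinc_integral(p, h=0.005):
    """Simpson approximation of the integral of e^{-px} sin(x)/x over [0, 40/p]."""
    upper = 40.0 / p                        # beyond this, e^{-px} < e^{-40}
    steps = 2 * math.ceil(upper / (2 * h))  # even number of subintervals
    h = upper / steps
    total = 0.0
    for j in range(steps + 1):
        x = j * h
        sinc = math.sin(x) / x if x else 1.0  # value 1 at x = 0 by continuity
        weight = 1 if j in (0, steps) else (4 if j % 2 else 2)
        total += weight * math.exp(-p * x) * sinc
    return total * h / 3
```

As $p$ decreases through 1, 0.1, 0.05 the computed values increase towards $\pi/2$, which is the value the limit assigns to $\int_0^\infty (\sin x)/x\,dx$.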


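6. Prove that

$$\int_0^{\infty} \frac{\sin x}{x}\,dx = \frac{\pi}{2}.$$

Hint. Let

$$F(p) = \int_0^{\infty} e^{-px}\, \frac{\sin x}{x}\,dx,$$

compute $F'(p)$ for $p > 0$, and obtain an explicit formula for $F(p)$.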
7. Use the result of the previous exercise, along with some trigonometric identities, 
to derive the following results: 


(a) $\displaystyle \int_0^{\infty} \frac{\sin x \cos x}{x}\,dx = \frac{\pi}{4}$


© sin* x 
(c) dx = 
0 Xx 


2 
© ginty 
(d) dx = 
0 Xx 


12.5.5 Pointers to Further Study 


- Fourier analysis
- Laplace transforms
- Differential equations
- Complex analysis


12.6 (◊) The Gamma Function


In this nugget we study the Gamma function, arguably the most important of the 
special functions, and the first to be studied historically. The Gamma function has a 
way of appearing, as you might say, unexpectedly, in formulas; in particular it is a 
constituent of some important special functions, notably Bessel functions. There is 
no better way to begin studying special functions than by learning about the Gamma 
function. 

We shall develop some of its properties through a series of exercises, using only 
the tools made available in this text. The further study of the Gamma function requires 
methods that go beyond the fundamental analysis of this text, such as multiple inte- 
grals and complex analysis. It is possible to prove some of the properties, normally 
obtained with ease by more advanced methods, using only fundamental analysis. This 
requires much ingenuity and effort, and one may ask what is achieved by demon- 
strating that more powerful methods can be avoided. 

The Gamma function $\Gamma(x)$ extends the function $f(n) = (n - 1)!$ from the
positive integers to the real numbers (with the exclusion of the negative
integers and 0), while preserving the characteristic property of the factorial function,
$\Gamma(x) = (x - 1)\Gamma(x - 1)$. It has been defined in many different ways, but by far the
simplest is to use the so-called Eulerian integral of the second kind. This defines
$\Gamma(x)$ for $x > 0$ by the integral

$$\Gamma(x) = \int_0^{\infty} t^{x-1} e^{-t}\,dt.$$

The integral, whilst obviously improper because of its upper limit, is also improper
at 0 if $x < 1$.
12.6.1 Exercises


1. Show that the integral defining the Gamma function is convergent at both ends if
$x > 0$.
2. Show that for $x > 1$ we have

$$\Gamma(x) = (x - 1)\Gamma(x - 1)$$

and deduce that

$$\Gamma(n) = (n - 1)!$$

if $n$ is a positive integer.
Note. The property proved here can be used to extend the Gamma function to negative values
of $x$, except at negative integers, by setting

$$\Gamma(x) = \frac{\Gamma(x + m)}{(x + m - 1)(x + m - 2) \cdots x}$$

using any integer $m$, such that $x + m > 0$. It does not matter what integer is used; the same
value is obtained by using $m + 1$ as by using $m$.
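The extension described in the note can be tried out directly. This is a minimal sketch, assuming that `math.gamma` is used for positive arguments; the function name `gamma_extended` and its optional parameter `m` are illustrative.

```python
import math


def gamma_extended(x, m=None):
    """Gamma(x) = Gamma(x + m) / ((x + m - 1)(x + m - 2) ... x), with x + m > 0."""
    if x <= 0 and x == int(x):
        raise ValueError("Gamma is not defined at 0 or at negative integers")
    if m is None:
        m = max(0, math.floor(-x) + 1)  # smallest m >= 0 with x + m > 0
    if x + m <= 0:
        raise ValueError("m must satisfy x + m > 0")
    denominator = 1.0
    for j in range(m):
        denominator *= x + j  # product x (x + 1) ... (x + m - 1)
    return math.gamma(x + m) / denominator
```

For example, `gamma_extended(-1.5)` uses $m = 2$ and returns $\Gamma(\frac12)/\bigl((-\frac12)(-\frac32)\bigr) = \frac43\sqrt{\pi}$, and passing a larger `m` gives the same value up to rounding.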

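3. The Gamma function can be used to “tidy up” the coefficients in the binomial
series. Let $a$ be a real number. Show that

$$\frac{a(a - 1) \cdots (a - k + 1)}{k!} = \frac{\Gamma(a + 1)}{\Gamma(k + 1)\,\Gamma(a - k + 1)}.$$

4. Show that $\Gamma\bigl(\frac{1}{2}\bigr) = \sqrt{\pi}$. Deduce that for all natural numbers $n$ we have

$$\Gamma\Bigl(n + \frac{1}{2}\Bigr) = \frac{(2n)!}{4^n\, n!} \sqrt{\pi}.$$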
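The half-integer formula of Exercise 4 is easily compared with a library implementation. The sketch below assumes the formula in the form $\Gamma(n + \frac12) = (2n)!\sqrt{\pi}/(4^n n!)$ and uses `math.gamma` as the reference; the function name is illustrative.

```python
import math


def gamma_half_integer(n):
    """Value of (2n)! sqrt(pi) / (4^n n!) for a non-negative integer n."""
    return math.factorial(2 * n) * math.sqrt(math.pi) / (4**n * math.factorial(n))
```

The case $n = 0$ reduces to $\Gamma(\frac12) = \sqrt{\pi}$.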


5. Show that $\Gamma(x)$ has derivatives of all orders and its $n$th derivative (for $x > 0$) is
given by the formula

$$\Gamma^{(n)}(x) = \int_0^{\infty} t^{x-1} (\ln t)^n e^{-t}\,dt.$$

Hint. You will have to justify the repeated differentiation under the integral sign.
One possibility is to use Taylor’s theorem to estimate the quantity

$$t^{x+h-1} - t^{x-1} - h\, t^{x-1} \ln t,$$

but you will have to cope with the fact that the function $t^{x-1}$, regarded as a function
of $x$, is decreasing if $0 < t < 1$ and increasing if $t > 1$.
6. (a) Show that for $x > 0$ the Gamma function can be written as the sum of two
series:

$$\Gamma(x) = \sum_{n=0}^{\infty} \frac{(-1)^n}{n!}\, \frac{1}{n + x} + \sum_{n=0}^{\infty} c_n x^n$$

where

$$c_n = \frac{1}{n!} \int_1^{\infty} t^{-1} (\ln t)^n e^{-t}\,dt.$$

Hint. Split the integral into two: from 0 to 1, and from 1 to $\infty$. Expand
the integrands in power series and argue that the order of integration and
summation can be interchanged. This is tricky for the improper integral and
is a good example of an argument that is much easier to carry out using the
Lebesgue integral.

(b) Show that the power series $\sum_{n=0}^{\infty} c_n x^n$ has infinite radius of convergence.

(c) Show that the series

$$\sum_{n=0}^{\infty} \frac{(-1)^n}{n!}\, \frac{1}{n + z}$$

converges for all complex $z$, except for $z = 0, -1, -2, \ldots$, and that its sum
function, restricted to the real line, has derivatives of all orders on the real
line $\mathbb{R}$ minus the set $\{0, -1, -2, \ldots\}$.


We revisit the example treated in the nugget on asymptotic orders. It has a tenuous 
connection to the Gamma function. 


7. (◊) Recall the function


se 
Eq) = [ 7e */t dt, (x > 0) 
0 


that was studied in Sect. 11.9. 
(a) Show that

$$E(x) = \int_x^{\infty} \frac{e^{-u}}{u}\,du, \quad (x > 0).$$

So $-E(x)$ is an antiderivative for the function $e^{-x}/x$; in fact the one that
tends to 0 at $\infty$. It solves the troublesome integral $\int e^{-x}/x\,dx$.
(b) Show that for all $x > 0$ we have the series expansion

$$E(x) = C - \ln x - \sum_{n=1}^{\infty} \frac{(-1)^n x^n}{n\, n!}$$

where $C$ is a certain constant.
(c) Show that

$$C = \int_0^{\infty} (\ln t)\, e^{-t}\,dt = \Gamma'(1).$$

Note. $\Gamma'(1)$ is known to be $-\gamma$, where $\gamma$ is the Euler–Mascheroni constant. See
Exercise 8.

(d) The series in item (b) is convergent for all $x$. For $x = 10$, estimate the
size of the 10th term, and conclude that it is far better to use the asymptotic
expansion to 10 terms (see Sect. 11.9) than the convergent series to 10 terms.
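The comparison asked for in item (d) can be carried out in a few lines. The sketch below makes two assumptions: that $C = -\gamma$, as stated in the note to item (c), and that the asymptotic expansion of Sect. 11.9 is the standard one, $E(x) \sim \frac{e^{-x}}{x}\sum_{k \ge 0} (-1)^k k!/x^k$. The reference value is obtained by summing the convergent series far beyond its largest terms; all names are illustrative.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def e_series(x, terms):
    """Partial sum C - ln x - sum_{n=1}^{terms} (-1)^n x^n / (n n!), with C = -gamma."""
    s = -EULER_GAMMA - math.log(x)
    term = 1.0  # holds (-x)^n / n!
    for n in range(1, terms + 1):
        term *= -x / n
        s -= term / n
    return s


def e_asymptotic(x, terms):
    """(e^{-x}/x) * sum_{k=0}^{terms-1} (-1)^k k! / x^k."""
    s, term = 0.0, 1.0  # term holds (-1)^k k! / x^k
    for k in range(terms):
        s += term
        term *= -(k + 1) / x
    return math.exp(-x) / x * s


x = 10.0
reference = e_series(x, 80)  # the series has converged long before n = 80
error_series = abs(e_series(x, 10) - reference)
error_asymptotic = abs(e_asymptotic(x, 10) - reference)
```

The 10th term of the convergent series has size $10^{10}/(10 \cdot 10!) \approx 275.6$, so ten terms of that series are useless at $x = 10$, whereas ten terms of the asymptotic expansion are already accurate to several significant figures.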


8. We conclude with some final spectacular results concerning the Gamma function:


(a) Let $(1/p) + (1/q) = 1$, $p > 1$, $q > 1$. Show that

$$\Gamma\Bigl(\frac{x}{p} + \frac{y}{q}\Bigr) \le \Gamma(x)^{1/p}\, \Gamma(y)^{1/q}.$$

Hint. Use Hölder’s inequality, Sect. 12.4 Exercise 7.

(b) Show that $\ln \Gamma(x)$ is convex on the interval $]0, \infty[$.
(c) Show that for all $0 < x < 1$ and every natural number $n$ we have

$$x \ln(n) \le \ln \Gamma(n + x + 1) - \ln(n!) \le x \ln(n + 1).$$

Hint. Use (b) and compare the chords of $y = \ln \Gamma(x)$ on the intervals $[n, n + 1]$,
$[n + 1, n + x + 1]$ and $[n + 1, n + 2]$.

(d) Deduce from (c) that for $0 < x < 1$ we have

$$0 \le \ln \Gamma(x) - \ln\Bigl(\frac{n^x\, n!}{x(x + 1) \cdots (x + n)}\Bigr) \le x \ln\Bigl(1 + \frac{1}{n}\Bigr).$$

(e) Deduce that

$$\Gamma(x) = \lim_{n\to\infty} \frac{n^x\, n!}{x(x + 1) \cdots (x + n)},$$

not just for $0 < x < 1$, but for all $x > 0$.

Note. This limit can be rewritten to give Euler’s original definition of the Gamma function
as an infinite product.

(f) The result of taking the logarithm on both sides of the limit formula for $\Gamma(x)$
obtained in the previous item, followed by differentiating both sides and then
interchanging the derivative and the limit, suggests that the following might
be true:

$$\frac{\Gamma'(x)}{\Gamma(x)} = \lim_{n\to\infty} \Bigl(\ln n - \sum_{k=0}^{n} \frac{1}{x + k}\Bigr).$$

Give a rigorous proof by showing that for any $K > 0$ the limit here is attained
uniformly with respect to $0 < x < K$.
(g) Deduce that $-\Gamma'(1)$ is the Euler–Mascheroni constant $\gamma$.
(h) Prove the theorem of Bohr and Mollerup:
Let $f$ be a function with domain $]0, \infty[$ that has the following three
properties:
(i) $\ln f(x)$ is convex.
(ii) $f(x) = (x - 1) f(x - 1)$, $(x > 1)$.
(iii) $f(1) = 1$.
Then $f(x) = \Gamma(x)$.
Hint. Repeat the arguments of items (c), (d) and (e), using $f$ instead of $\Gamma$.
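The two-sided estimate of item (d) and the limit formula of item (e) can be observed numerically. The sketch below works with logarithms to avoid overflow and assumes that `math.lgamma` supplies $\ln \Gamma$; the function names are illustrative.

```python
import math


def ln_euler_product(x, n):
    """ln( n^x n! / (x (x + 1) ... (x + n)) ), computed with logarithms."""
    log_denominator = math.fsum(math.log(x + k) for k in range(n + 1))
    return x * math.log(n) + math.lgamma(n + 1) - log_denominator


def gap(x, n):
    """ln Gamma(x) minus the logarithm of the n-th approximant."""
    return math.lgamma(x) - ln_euler_product(x, n)
```

For $0 < x < 1$ the value `gap(x, n)` should lie between 0 and $x \ln(1 + 1/n)$, and for other $x > 0$ it should still tend to 0, although only at the slow rate of order $1/n$.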


12.6.2 Pointers to Further Study 


- Special functions
- Complex analysis


Appendix 
Afterword and Acknowledgements 


Many readers may have noticed that most, if not all, of the material appearing in this 
text has been treated elsewhere, probably in dozens of publications. It is inevitable 
that bits here and there are lifted out of some previous textbooks, whether I like it 
or no. Therefore I wish to list here the four textbooks that, over more than half a 
century, have been the ones I have principally turned to when I have needed to check 
up on some detail of fundamental analysis. No recommendation is implied in this 
list and it is certain that each has its virtues and faults, neither of which do I wish 
to elaborate on here. However, it is also certain that some acknowledgement is due. 
The dates are those of first publication. 


(a) G.H. Hardy. A Course of Pure Mathematics. Cambridge University Press, 1908. 

(b) E. G. Phillips. A Course of Analysis. Cambridge University Press, 1930. 

(c) J. C. Burkill. A First Course in Mathematical Analysis. Cambridge University 
Press, 1962. 

(d) M. Spivak. Calculus. Publish or Perish, 1967. 
Index


A 
Abel, 2 
Abelian theorems, 361, 421 
Abel’s lemma, 350 
Abel’s theorem on power series, 352 
Abel summation, 361 
Absolute value, 15 
Absolutely convergent integral, 411 
Absolutely convergent series, 307 
Algebraic function, 237 
Algebraic number, 160 
Alternating series, 309 
Antiderivative, 195, 217, 253, 265 
Approximation in the mean, 235 
Archimedean property, 29 
Archimedes, 1
Archimedes’ theorem 

for the inscribed sphere, 230 

for the parabolic segment, 196, 230 
Arc length integral, 226 
Area, 195 
Arithmetic axioms, 7 
Arithmetic geometric mean, 57, 192, 335 
Arithmetic mean and geometric mean 

(inequality of), 56, 185 

Asymptotic expansion, 389, 403, 417, 425 
Axiom of choice, 107, 122 
Axioms for the real numbers, 7-14 


B 

Babylonian method, 189 

Basel problem, 71 

Berkeley, 2 

Bernoulli numbers, 381, 415-417 


Bernstein’s theorem, 379 

Bessel function, 258, 261 

Big O notation, 387 

Bijective function (definition of), 119 
Binomial coefficients, 18 

Binomial rule, 19 

Binomial series, 346 

Bohr and Mollerup (theorem of), 426 
Bolzano, 2 

Bolzano—Weierstrass theorem, 60, 115, 116 
Bonnet’s theorem, 234 

Bound variable, 43 

Bounded function, 103 

Boundedness theorem, 113 

Bounded sequence, 47 

Bounded set, 25, 26 


C
Cantor, 2 
Cantor’s reals, 78 
Cauchy, 2 
Cauchy principal value, 403 
Cauchy product of series, 316 
Cauchy—Schwarz inequality, 20 
Cauchy’s condensation test, 73 
Cauchy’s convergence principle, 63, 121, 
306, 328, 330, 410 
Cauchy’s remainder, see Taylor’s theorem 
Cauchy’s root test, 308 
Cesaro summation, 360 
Chain rule, 136, 143 
Chebyshev polynomials, 304 
Circular functions, 132 
addition rules for, 243 
definition of, 238-243, 356

derivatives of, 243 

exact values of, 302 

power series for, 346 
Codomain of function (definition of), 91 
Comparison test, 66 

for integrals, 400 
Complete induction, 7 
Completeness axiom, 13, 25, 111 
Complex numbers, 291 

argument of, 297 

conjugate of, 294 

logarithm of, 299 

modulus of, 294 

nth root of, 297 

real and imaginary parts of, 292 

square root of, 293 
Complex powers (differentiation of), 349 
Complex-valued integrals, 409 
Composition of functions, 106 
Conditionally convergent series, 309 
Continued fractions, 85-89 
Continuity rules, 99 
Continuous function (definition of), 94 
Convergent sequence, 39, 47 
Convergent series, 64 
Convex function, 176, 182, 185 
Critical point, 151 
Cubic equation 

Cardano’s solution of, 303 

trigonometric solution of, 303 


D 
D’Alembert’s test, see Ratio test
Decimals, 20-24, 73-77 

algorithm for, 22 
Decreasing function, 104 
Decreasing sequence, 49 
Dedekind, 2 
Dedekind section, 12, 13, 25, 27 
Dedekind’s reals, 78 
Definite integral, 217 
De Moivre’s theorem, 297 
Derivative (definition of), 130 
Descartes’ rule of signs, 173, 175 
Determinant, 149 
Difference quotient (definition of), 130 
Differentiable (definition of), 130 
Differential equation, 195, 217, 249, 253, 

257-260 

Differential quotient, 142 
Differentiation 




of function sequences, 332 

of function series, 332 

of power series, 342 

under the integral sign, 404 
Differentiation rules, 134-140 
Dirichlet series, 359 
Dirichlet’s test, 351, 358, 413 
Discontinuities of monotonic functions, 105 
Divergent sequence, 40, 47 
Divergent series, 64 
Domain of function (definition of), 91 
Dyadic fraction, 182 


E 
e (Euler’s constant) 
calculation of, 255, 341 
definition of, 249 
irrationality of, 363 
Elementary transcendental function, 237 
Element (of a set), 3 
Elliptic function, 228, 259 
Elliptic integral, 269, 334 
Empty set, 3, 12 
Error estimate 
for $(1 + x)^a$, 372, 374
for $\cos x$, 370
for $\ln(1 + x)$, 371, 374
for $\sin x$, 370
for $e^x$, 370
Error function, 377, 403 
Euclid, 1
Euclidean algorithm, 89, 173
Eudoxus, 1
Euler, 2 
Euler—Maclaurin summation formula, 269, 
385, 415-417 
Euler—Mascheroni constant, 407, 425, 426 
Euler’s formula, 346 
Even function, 148 
Exponential function, 132, 148, 169 
definition of, 249 
laws of, 249, 250, 256 
with complex argument, 341 
Exponential series, 339 
Extreme value theorem, 115 


F 

Factorial, 19 

Favourite identity, 48 

Fermat numbers, 7 

Fibonacci numbers, 18, 51, 191 
Finite set, 58 




Fourier, 2 
Fourier transform, 418, 419 
Free variable, 43 
Function (definition of), 91 
Function sequence 
pointwise convergence of, 327 
uniform convergence of, 327 
Function series, 325 
pointwise convergence of, 329 
uniform convergence of, 329 
Fundamental theorem 
of algebra, 271 
of calculus, 216-218, 224 


G 

Gamma function, 422-425 

Gauss’ lemma, 12, 19 

Gauss’ test, 320, 322 

Geometric series, 23, 67 

Golden Ratio, 51 

Graph (of a function), 93 

Greatest lower bound, see Infimum 


H 

Half-angle substitution, 278 
Harmonic series, 71, 254 
Heine—Borel theorem, 234 
Hermite polynomial, 260 

Higher derivatives, 147-149 
Hölder’s inequality, 186, 212, 413
Hyperbolic function, 252 
Hypergeometric function, 259 


I 
Implicit differentiation, 145 
Improper integral, 397 
Increasing function, 104 
Increasing sequence, 49 
Indefinite integral, 217 
Indeterminate form, 167 
Induction principle, 5 
Infimum, 26-28 
Infinite limit, 47, 101 
Infinite series, 38 
Infinite set, 58 
Inflection point, 180 
Injective function (definition of), 119 
Integers, 11 
Integrability 
of continuous functions, 202 
of monotonic functions, 203 
of step functions, 208
Integrable function (definition of), 199 
Integral test (for series), 405 
Integral with limits, 208 
Integration by parts (rule of), 263, 270, 271 
Integration by substitution (rule of), 264 
Integration of function sequences, 331 
Integration of function series, 331 
Integration rules 

join of intervals, 207 

multiplication by scalars, 206 

sum, 205 
Intermediate value property, 112, 117, 118, 

156 

Intermediate value theorem, 111, 172 
Intersection, 3 
Interval, 24 
Inverse function 

definition of, 119 

differentiation of, 138 
Irrational numbers, 12 
Iteration, 50, 108, 125-127, 186, 187 


J 
Jensen’s inequality, 184 
Jump discontinuity, 100 


K 
Kepler, 2 
Kummer’s tests, 321 


L 
Lagrange’s remainder, see Taylor’s theorem 
Laplace transform, 420 
Laws of exponents, 18, 20 
Least upper bound, see Supremum 
Lebesgue integral, 200 
Legendre 
function, 258 
polynomial, 261 
Legendre transform, 183, 270 
Leibniz, 1 
Leibniz’s formula, 148 
Leibniz’s notation 
for derivatives, 141, 143, 147 
for integrals, 209, 217 
Leibniz’s rule, 134 
Leibniz’s test, 309 
L’Hopital’s rule, 161-167 
Limit 
of a complex sequence, 305 
of a function, 96

of a real sequence, 39 
Limit at infinity, 101 
Limit inferior, 80-85, 110, 118 
Limit point (of a set), 57-60, 95, 107 
Limit rules 

for functions, 97-99 

for sequences, 52-57 
Limit superior, 80-85, 110, 118 
Liouville’s theorem, 160 
Lipschitz condition, 124, 213 
Local maximum point, 151 
Local minimum point, 151 
Logarithms 

definition of, 247, 250 

laws of, 248, 250 

natural, 248, 254 
Lower integral (definition of), 198 
Lower sum (definition of), 197 


M 

Maclaurin—Cauchy theorem, 406 
Maclaurin series, 343 

Maxima and minima, 151-153, 166 
Maxima and minima (problem of), 116 
Maximum, 28 

Maximum function, 15 

Mean (of periodic function), 220 


Mean value theorem, 113, 153, 154, 156, 


158-160, 178, 187 

Cauchy’s form of, 158 
Mean value theorem for integrals, 211 
Mertens’ theorem, 317 
Method of bisection, 112, 113 
Method of proportional parts, 378 
Midpoint rule, 285 
Minimum, 28 
Minimum function, 15 
Model railway (join in track), 149 
Monotonic function, 104 
Monotonic sequence, 49 
Most beautiful formula, 346 
Multiplicity (of a root), 171 


N 

Natural numbers, 3-6, 10 

Newton, 1

Newton’s method, 190, 191 
Non-Archimedean field, 32 

nth root function, 17, 113, 120 
Numerical differentiation, 170, 378 
Numerical integration, 224, 284 


O

Odd function, 148 

One-sided limits, 99 

Ordering axioms, 9 

Oscillation (of a function), 123, 212 
Ostrogradski’s method, 281 


P 




Partial fractions decomposition, 272-275 


Partial sum, 64 
Partition (definition of), 197 
Periodic function, 110, 220 
Period of pendulum, 333, 335 
π (Archimedes’ constant)

calculation of, 357 

definition of, 239, 356 

estimate of, 280, 341 

irrationality of, 364 

series for, 350, 353 
Piece-wise continuous function, 218 
Postal charges, 92 
Power series, 336 

for arcsin x, 356 

for arctan x, 350 

for cot x, 386 

for In(1 + x), 353 

for tan x, 379, 386 
Preservation of inequalities, 50 
Primitive function, 218-220, 270 
Product of series, 314 
p-series, 70 


Q 
Quadratic convergence, 189, 192 


Quadrature, 195 
Quantifier, 40 


R 

Raabe’s test, 322 

Radius of convergence, 337 
formula for, 344 

Ramanujan (series for π), 70

Range (of function), 111 

Ratio test, 67, 309, 322 


Rational function (integration of), 271, 281 


Rational numbers, 11 

Rational powers, 17, 20, 121 
differentiation of, 139 

Real powers (differentiation of), 251 

Rearrangement of series, 310, 311 




Reduction formula, 277, 278, 280 

Removable discontinuity, 100 

Repeating decimals, 23 

Riemann—Darboux integral, 196 
definition of, 199 

Riemann’s condition for integrability, 201 

Riemann’s rearrangement theorem, 318 

Riemann sum, 220, 223-227, 229, 255, 284 

Riemann zeta function, 317, 359, 380, 408 

Ring of Power, 124 

Rolle’s theorem, 153 


S
Sandwich principle, 55 
Second mean value theorem for integrals, 
232, 234, 270, 271, 412, 413 
Semi-continuous function, 95, 110, 118 
Sequence, 35, 107 
Simpson’s rule, 286 
Small oscillation theorem, 123 
Specification, 4 
Square root of 2, 13 
Squeeze rule, 55 
Stationary point, 151 
Step function, 231 
definition of, 200 
Stirling’s approximation, 391, 415, 417 
Strict local maximum point, 151 
Strict local minimum point, 151 
Strictly concave function, 180 
Strictly convex function, 176-178, 180-184 
Strictly decreasing sequence, 49 
Strictly increasing sequence, 49 
Sturm’s theorem, 173, 174 
Subsequence, 60-63 
Subset, 3 
Summability theory, 360, 421 
Sum (of a series), 64, 307 
Supremum, 26-28 
Surface of revolution integral, 226, 229, 230 
Surjective function (definition of), 119 
T
Tangent line, 141, 144, 179 
Tauber’s theorem, 361 
Tauberian theorems, 361 
Taylor polynomial, 156, 165, 166, 367 
Taylor series, 366 
Taylor’s theorem, 157, 166, 187 
Peano’s form of, 157, 389 
with Cauchy’s remainder, 373 
with integral remainder, 375 
with Lagrange’s remainder, 367 
with Schlömilch’s remainder, 377, 380
Telescoping series, 71 
Transcendental function, 237, 257-260 
Transcendental number, 161 
Transitivity, 10 
Trapezium rule, 284 
Trichotomy, 10 
Trigonometric functions, 245 


U 

Uniform approximation, 231 
Uniform continuity, 124 

Union, 3 

Uniqueness of the real numbers, 79 
Upper integral (definition of), 198 
Upper sum (definition of), 197 


V
Volume of revolution integral, 226, 229 


W

Wallis’ integrals, 279 

Wallis’ product for π, 394, 415
Weierstrass, 2 

Weierstrass M-test, 329 
Weierstrass’ theorem, 59 


Y 
Young’s inequality, 183 


